The power of a raw Lua interpreter to manipulate strings is quite limited. A program can create string literals and concatenate them. But it cannot extract a substring, check its size, or examine its contents. The full power to manipulate strings in Lua comes from its string library.
Some functions in the string library are quite simple: string.len(s) returns the length of a string s. string.rep(s, n) returns the string s repeated n times. You can create a string with 1M bytes (for tests, for instance) with string.rep("a", 2^20). string.lower(s) returns a copy of s with the upper-case letters converted to lower case; all other characters in the string are not changed (string.upper converts to upper case). As a typical use, if you want to sort an array of strings regardless of case, you may write something like

    table.sort(a, function (a, b)
      return string.lower(a) < string.lower(b)
    end)

Both string.upper and string.lower follow the current locale. Therefore, if you work with the European Latin-1 locale, the expression

    string.upper("ação")

results in "AÇÃO".
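The simple functions above can be tried out with a short script. This is a minimal sketch that assumes the default C locale and plain ASCII input; the helper name sort_ignore_case and the sample words are hypothetical, chosen only for illustration.

```lua
-- Hypothetical helper: sorts an array of strings in place, ignoring case,
-- and returns the same array for convenience.
function sort_ignore_case(t)
  table.sort(t, function (x, y)
    return string.lower(x) < string.lower(y)
  end)
  return t
end

print(string.len("hello"))        --> 5
print(string.rep("ab", 3))        --> ababab
print(string.upper("hello"))      --> HELLO

local words = sort_ignore_case({"banana", "Apple", "cherry"})
print(table.concat(words, " "))   --> Apple banana cherry
```

Without the call to string.lower in the comparison, "Apple" would still sort first here, but only because upper-case letters precede lower-case ones in ASCII; a word such as "Zebra" would then wrongly sort before "banana".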
The call string.sub(s,i,j) extracts a piece of the string s, from the i-th to the j-th character inclusive. In Lua, the first character of a string has index 1. You can also use negative indices, which count from the end of the string: The index -1 refers to the last character in a string, -2 to the previous one, and so on. Therefore, the call string.sub(s, 1, j) gets a prefix of the string s with length j; string.sub(s, j, -1) gets a suffix of the string, starting at the j-th character (if you do not provide a third argument, it defaults to -1, so we could write the last call as string.sub(s, j)); and string.sub(s, 2, -2) returns a copy of the string s with the first and last characters removed:

    s = "[in brackets]"
    print(string.sub(s, 2, -2))  --> in brackets
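To make the indexing rules concrete, here is a small sketch of the prefix, suffix, and negative-index cases described above. The sample string is arbitrary and chosen only for illustration.

```lua
local s = "hello world"            -- 11 characters, indexed 1 to 11

print(string.sub(s, 1, 5))         --> hello       (prefix of length 5)
print(string.sub(s, 7, -1))        --> world       (suffix from the 7th character)
print(string.sub(s, 7))            --> world       (third argument defaults to -1)
print(string.sub(s, -5))           --> world       (last five characters)
print(string.sub(s, 2, -2))        --> ello worl   (first and last removed)
```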
Remember that strings in Lua are immutable. The string.sub function, like any other function in Lua, does not change the value of a string, but returns a new string. A common mistake is to write something like

    string.sub(s, 2, -2)

and to assume that the value of s will be modified. If you want to modify the value of a variable, you must assign the new value to the variable:

    s = string.sub(s, 2, -2)
The string.char and string.byte functions convert between characters and their internal numeric representations. The function string.char gets zero or more integers, converts each one to a character, and returns a string concatenating all those characters. The function string.byte(s, i) returns the internal numeric representation of the i-th character of the string s; the second argument is optional, so that a call string.byte(s) returns the internal numeric representation of the first (or single) character of s. In the following examples, we assume that characters are represented in ASCII:

    print(string.char(97))                    --> a
    i = 99; print(string.char(i, i+1, i+2))   --> cde
    print(string.byte("abc"))                 --> 97
    print(string.byte("abc", 2))              --> 98
    print(string.byte("abc", -1))             --> 99

In the last line, we used a negative index to access the last character of the string.
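As a further illustration, the two functions can be combined to take a string apart into numeric codes and put it back together. This is a minimal sketch assuming ASCII; the helper name codes is hypothetical and not part of the string library.

```lua
-- Hypothetical helper: returns an array with the numeric code
-- of each character in s, in order.
function codes(s)
  local t = {}
  for i = 1, string.len(s) do
    t[i] = string.byte(s, i)
  end
  return t
end

local c = codes("Lua")
print(table.concat(c, " "))             --> 76 117 97
print(string.char(c[1], c[2], c[3]))    --> Lua
print(string.len(string.char()))        --> 0
```

The last line shows that string.char called with no arguments returns the empty string.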
The function string.format is a powerful tool when formatting strings, typically for output. It returns a formatted version of its variable number of arguments following the description given by its first argument, the so-called format string. The format string has rules similar to those of the printf function of standard C: It is composed of regular text and directives, which control where and how each argument must be placed in the formatted string. A simple directive is the character '%' plus a letter that tells how to format the argument: 'd' for a decimal number, 'x' for hexadecimal, 'o' for octal, 'f' for a floating-point number, 's' for strings, plus other variants. Between the '%' and the letter, a directive can include other options, which control the details of the format, such as the number of decimal digits of a floating-point number:

    print(string.format("pi = %.4f", math.pi))      --> pi = 3.1416
    d = 5; m = 11; y = 1990
    print(string.format("%02d/%02d/%04d", d, m, y)) --> 05/11/1990
    tag, title = "h1", "a title"
    print(string.format("<%s>%s</%s>", tag, title, tag))
                                                    --> <h1>a title</h1>

In the first example, the %.4f means a floating-point number with four digits after the decimal point. In the second example, the %02d means a decimal number ('d'), with at least two digits and zero padding; the directive %2d, without the zero, would use blanks for padding. For a complete description of those directives, see the Lua reference manual. Or, better yet, see a C manual, as Lua calls the standard C libraries to do the hard work here.
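The other directives mentioned above, and the difference between zero and blank padding, can be checked with a few more calls. This is a small sketch; the brackets in some format strings are only there to make the padding visible.

```lua
print(string.format("%x", 255))            --> ff
print(string.format("%o", 8))              --> 10
print(string.format("[%2d]", 5))           --> [ 5]
print(string.format("[%02d]", 5))          --> [05]
print(string.format("[%5d]", 42))          --> [   42]
print(string.format("[%-5s]", "ab"))       --> [ab   ]
print(string.format("[%6.2f]", 3.14159))   --> [  3.14]
```

A minus sign after the '%', as in %-5s, left-justifies the value within its field, as it does in C's printf.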