The power of a raw Lua interpreter to manipulate strings is quite limited. A program can create string literals and concatenate them, but it cannot extract a substring, check its size, or examine its contents. The full power to manipulate strings in Lua comes from its string library.
Some functions in the string library are quite simple:
string.len(s) returns the length of a string s. string.rep(s, n) returns the string s repeated n times. You can create a string with 1M bytes (for tests, for instance) with string.rep("a", 2^20). string.lower(s) returns a copy of s with the uppercase letters converted to lowercase; all other characters in the string are left unchanged (string.upper converts to uppercase). As a typical use, if you want to sort an array of strings regardless of case, you may write something like
table.sort(a, function (a, b)
return string.lower(a) < string.lower(b)
end)
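The functions above can be tried together in a short script. This is only a minimal sketch: the sample string and the array `words` are hypothetical values chosen for illustration, and the outputs assume plain ASCII text.

```lua
-- Basic string library calls on a sample string.
local s = "Hello"
print(string.len(s))            --> 5
print(string.rep("ab", 3))      --> ababab
print(string.lower(s))          --> hello
print(string.upper(s))          --> HELLO

-- Case-insensitive sort; `words` is a hypothetical sample array.
local words = {"pear", "Apple", "banana"}
table.sort(words, function (x, y)
  return string.lower(x) < string.lower(y)
end)
print(table.concat(words, " ")) --> Apple banana pear
```

Without the comparison function, a plain table.sort(words) would compare raw character codes, so in ASCII "Apple" would sort first only because uppercase letters come before lowercase ones.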
Both string.upper and string.lower follow the current locale. Therefore, if you work with the European Latin-1 locale, the expression string.upper("ação") results in "AÇÃO".
The call string.sub(s, i, j) extracts a piece of the string s, from the i-th to the j-th character inclusive. In Lua, the first character of a string has index 1. You can also use negative indices, which count from the end of the string: the index -1 refers to the last character in a string, -2 to the previous one, and so on. Therefore, the call string.sub(s, 1, j) gets a prefix of the string s with length j; string.sub(s, j, -1) gets a suffix of the string, starting at the j-th character (if you do not provide a third argument, it defaults to -1, so we could write the last call as string.sub(s, j)); and string.sub(s, 2, -2) returns a copy of the string s with the first and last characters removed:
s = "[in brackets]"
print(string.sub(s, 2, -2)) --> in brackets
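As a further sketch of the indexing rules, the following calls take a prefix, a suffix, and a middle slice of a hypothetical sample string (the string itself is an assumption made for illustration):

```lua
-- "hello world" has 11 characters; index 1 is 'h', index -1 is 'd'.
local s = "hello world"
print(string.sub(s, 1, 5))    --> hello      (prefix of length 5)
print(string.sub(s, 7))       --> world      (suffix from the 7th character)
print(string.sub(s, -5, -1))  --> world      (last five characters)
print(string.sub(s, 2, -2))   --> ello worl  (first and last removed)
```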
Remember that strings in Lua are immutable. The string.sub function, like any other function in Lua, does not change the value of a string, but returns a new string. A common mistake is to write something like
string.sub(s, 2, -2)
and to assume that the value of s will be modified. If you want to modify the value of a variable, you must assign the new value to the variable:
s = string.sub(s, 2, -2)
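A quick way to convince yourself of this is to print the variable after each kind of call. The sketch below uses the same sample string; the variable `t` is a hypothetical name introduced only for the example.

```lua
local s = "[in brackets]"

-- The result of this call is discarded, so s keeps its old value.
string.sub(s, 2, -2)
print(s)                        --> [in brackets]

-- Only the returned value carries the change; s itself is untouched.
local t = string.sub(s, 2, -2)
print(s)                        --> [in brackets]
print(t)                        --> in brackets
```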
The string.char and string.byte functions convert between characters and their internal numeric representations. The function string.char gets zero or more integers, converts each one to a character, and returns a string concatenating all those characters. The function string.byte(s, i) returns the internal numeric representation of the i-th character of the string s; the second argument is optional, so that a call string.byte(s) returns the internal numeric representation of the first (or single) character of s. In the following examples, we assume that characters are represented in ASCII:
print(string.char(97)) --> a
i = 99; print(string.char(i, i+1, i+2)) --> cde
print(string.byte("abc")) --> 97
print(string.byte("abc", 2)) --> 98
print(string.byte("abc", -1)) --> 99
In the last line, we used a negative index to access the last character of the string.
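The two functions are inverses of each other, which the following sketch illustrates. It assumes ASCII, and it also assumes Lua 5.1 or later, where string.byte accepts an optional third argument and returns one code per character in the range; the text above describes only the two-argument form.

```lua
local s = "abc"

-- With a range, string.byte returns several values (Lua 5.1+).
local a, b, c = string.byte(s, 1, -1)
print(a, b, c)                   --> 97 98 99 (tab-separated)

-- string.char turns the codes back into the original string.
local round_trip = string.char(string.byte(s, 1, -1))
print(round_trip == s)           --> true

-- Called with no arguments, string.char returns the empty string.
print(string.char() == "")       --> true

-- Arithmetic on a character code: the letter after "a".
local next_letter = string.char(string.byte("a") + 1)
print(next_letter)               --> b
```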
The function string.format is a powerful tool when formatting strings, typically for output. It returns a formatted version of its variable number of arguments following the description given by its first argument, the so-called format string. The format string has rules similar to those of the printf function of standard C: it is composed of regular text and directives, which control where and how each argument must be placed in the formatted string. A simple directive is the character `%´ plus a letter that tells how to format the argument: `d´ for a decimal number, `x´ for hexadecimal, `o´ for octal, `f´ for a floating-point number, `s´ for strings, plus other variants. Between the `%´ and the letter, a directive can include other options, which control the details of the format, such as the number of decimal digits of a floating-point number:
print(string.format("pi = %.4f", math.pi)) --> pi = 3.1416
d = 5; m = 11; y = 1990
print(string.format("%02d/%02d/%04d", d, m, y))
--> 05/11/1990
tag, title = "h1", "a title"
print(string.format("<%s>%s</%s>", tag, title, tag))
--> <h1>a title</h1>
In the first example, the %.4f means a floating-point number with four digits after the decimal point. In the second example, the %02d means a decimal number (`d´), with at least two digits and zero padding; the directive %2d, without the zero, would use blanks for padding. For a complete description of those directives, see the Lua reference manual. Or, better yet, see a C manual, as Lua calls the standard C libraries to do the hard work here.
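To see the other directives and the padding options side by side, you can try a few more calls. This is a small sketch with arbitrary sample values; the outputs assume the default "C" locale, where the decimal separator is a point.

```lua
-- The same number in decimal and hexadecimal, and 8 in octal.
print(string.format("%d", 255))      --> 255
print(string.format("%x", 255))      --> ff
print(string.format("%o", 8))        --> 10

-- Field width: blanks by default, zeros with the 0 flag.
print(string.format("[%5d]", 42))    --> [   42]
print(string.format("[%05d]", 42))   --> [00042]

-- Precision for floating-point numbers (the value is rounded).
print(string.format("%.2f", 2/3))    --> 0.67

-- Several directives in one format string.
print(string.format("%s has %d items", "cart", 3))  --> cart has 3 items
```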