Module:TableTools

From WikiProjectMed
Jump to navigation Jump to search

This module includes a number of functions for dealing with Lua tables. It is a meta-module, meant to be called from other Lua modules, and should not be called directly from #invoke.

Loading the module

To use any of the functions, first you must load the module.

local TableTools = require('Module:TableTools')

isPositiveInteger

TableTools.isPositiveInteger(value)

Returns true if value is a positive integer, and false if not. Although it doesn't operate on tables, it is included here as it is useful for determining whether a given table key is in the array part or the hash part of a table.

isNan

TableTools.isNan(value)

Returns true if value is a NaN value, and false if not. Although it doesn't operate on tables, it is included here as it is useful for determining whether a value can be a valid table key. (Lua will generate an error if a NaN value is used as a table key.)

shallowClone

TableTools.shallowClone(t)

Returns a clone of a table. The value returned is a new table, but all subtables and functions are shared. Metamethods are respected, but the returned table will have no metatable of its own. If you want to make a new table with no shared subtables and with metatables transferred, you can use mw.clone instead. If you want to make a new table with no shared subtables and without metatables transferred, use deepCopy with the noMetatable option.

removeDuplicates

TableTools.removeDuplicates(t)

Removes duplicate values from an array. This function is only designed to work with standard arrays: keys that are not positive integers are ignored, as are all values after the first nil value. (For arrays containing nil values, you can use compressSparseArray first.) The function tries to preserve the order of the array: the earliest non-unique value is kept, and all subsequent duplicate values are removed. For example, for the table {5, 4, 4, 3, 4, 2, 2, 1} removeDuplicates will return {5, 4, 3, 2, 1}

numKeys

TableTools.numKeys(t)

Takes a table t and returns an array containing the numbers of any positive integer keys that have non-nil values, sorted in numerical order. For example, for the table {'foo', nil, 'bar', 'baz', a = 'b'}, numKeys will return {1, 3, 4}.

affixNums

TableTools.affixNums(t, prefix, suffix)

Takes a table t and returns an array containing the numbers of keys with the optional prefix prefix and the optional suffix suffix. For example, for the table {a1 = 'foo', a3 = 'bar', a6 = 'baz'} and the prefix 'a', affixNums will return {1, 3, 6}. All characters in prefix and suffix are interpreted literally.

numData

TableTools.numData(t, compress)

Given a table with keys like "foo1", "bar1", "foo2", and "baz2", returns a table of subtables in the format { [1] = {foo = 'text', bar = 'text'}, [2] = {foo = 'text', baz = 'text'} }. Keys that don't end with an integer are stored in a subtable named "other". The compress option compresses the table so that it can be iterated over with ipairs.

compressSparseArray

TableTools.compressSparseArray(t)

Takes an array t with one or more nil values, and removes the nil values while preserving the order, so that the array can be safely traversed with ipairs. Any keys that are not positive integers are removed. For example, for the table {1, nil, foo = 'bar', 3, 2}, compressSparseArray will return {1, 3, 2}.

sparseIpairs

TableTools.sparseIpairs(t)

This is an iterator function for traversing a sparse array t. It is similar to ipairs, but will continue to iterate until the highest numerical key, whereas ipairs may stop after the first nil value. Any keys that are not positive integers are ignored.

Usually sparseIpairs is used in a generic for loop.

for i, v in TableTools.sparseIpairs(t) do
   -- code block
end

Note that sparseIpairs uses the pairs function in its implementation. Although some table keys appear to be ignored, all table keys are accessed when it is run.

size

TableTools.size(t)

Finds the size of a key/value pair table. For example, for the table {foo = 'foo', bar = 'bar'}, size will return 2. The function will also work on arrays, but for arrays it is more efficient to use the # operator. Note that to find the table size, this function uses the pairs function to iterate through all of the table keys.

keysToList

TableTools.keysToList(t, keySort)

Returns a list of the keys in a table, sorted using either a default comparison function or a custom keySort function, which follows the same rules as the comp function supplied to table.sort.

sortedPairs

TableTools.sortedPairs(t, keySort)

Iterates through a table, with the keys sorted using the keysToList function. If there are only numerical keys, sparseIpairs is probably more efficient.

isArray

TableTools.isArray(t)

Returns true if all keys in the table are consecutive integers starting at 1.

listToSet

TableTools.listToSet(arr)

Creates a set from the array part of the table arr. Indexing the set by any of the values in arr returns true.

local set = TableTools.listToSet { "a", "b", "c" }
assert(set["a"] == true)

invert

TableTools.invert(t)

Transposes the keys and values in an array. For example, invert{ "a", "b", "c" } yields { a = 1, b = 2, c = 3 }.

deepCopy

TableTools.deepCopy(orig, noMetatable, alreadySeen)

Creates a copy of the table orig. As with mw.clone, all values that are not functions are duplicated and the identity of tables is preserved. If noMetatable is true, then the metatable (if any) is not copied. Can copy tables loaded with mw.loadData.

Similar to mw.clone, but mw.clone cannot copy tables loaded with mw.loadData and does not allow metatables not to be copied.

sparseConcat

TableTools.sparseConcat(t, sep)

Concatenates all values in the table that are indexed by a positive integer, in order.

length

TableTools.length(t)

Finds the length of an array, or of a quasi-array with keys such as "data1", "data2", etc. It uses a binary search algorithm to find the length, so as to use as few table lookups as possible.

This algorithm is useful for arrays that use metatables (e.g. frame.args) and for quasi-arrays. For normal arrays, just use the # operator, as it is implemented in C and will be quicker.

inArray

TableTools.inArray(arr, valueToFind)

Returns true if valueToFind is a member of the array arr, and false otherwise.


  1 --[[
  2 ------------------------------------------------------------------------------------
  3 --                               TableTools                                       --
  4 --                                                                                --
  5 -- This module includes a number of functions for dealing with Lua tables.        --
  6 -- It is a meta-module, meant to be called from other Lua modules, and should     --
  7 -- not be called directly from #invoke.                                           --
  8 ------------------------------------------------------------------------------------
  9 --]]
 10 
 11 local libraryUtil = require('libraryUtil')
 12 
 13 local p = {}
 14 
 15 -- Define often-used variables and functions.
 16 local floor = math.floor
 17 local infinity = math.huge
 18 local checkType = libraryUtil.checkType
 19 local checkTypeMulti = libraryUtil.checkTypeMulti
 20 
 21 --[[
 22 ------------------------------------------------------------------------------------
 23 -- isPositiveInteger
 24 --
 25 -- This function returns true if the given value is a positive integer, and false
 26 -- if not. Although it doesn't operate on tables, it is included here as it is
 27 -- useful for determining whether a given table key is in the array part or the
 28 -- hash part of a table.
 29 ------------------------------------------------------------------------------------
 30 --]]
 31 function p.isPositiveInteger(v)
 32 	return type(v) == 'number' and v >= 1 and floor(v) == v and v < infinity
 33 end
 34 
 35 --[[
 36 ------------------------------------------------------------------------------------
 37 -- isNan
 38 --
 39 -- This function returns true if the given number is a NaN value, and false
 40 -- if not. Although it doesn't operate on tables, it is included here as it is
 41 -- useful for determining whether a value can be a valid table key. Lua will
 42 -- generate an error if a NaN is used as a table key.
 43 ------------------------------------------------------------------------------------
 44 --]]
 45 function p.isNan(v)
 46 	return type(v) == 'number' and tostring(v) == '-nan'
 47 end
 48 
 49 --[[
 50 ------------------------------------------------------------------------------------
 51 -- shallowClone
 52 --
 53 -- This returns a clone of a table. The value returned is a new table, but all
 54 -- subtables and functions are shared. Metamethods are respected, but the returned
 55 -- table will have no metatable of its own.
 56 ------------------------------------------------------------------------------------
 57 --]]
 58 function p.shallowClone(t)
 59 	local ret = {}
 60 	for k, v in pairs(t) do
 61 		ret[k] = v
 62 	end
 63 	return ret
 64 end
 65 
 66 --[[
 67 ------------------------------------------------------------------------------------
 68 -- removeDuplicates
 69 --
 70 -- This removes duplicate values from an array. Non-positive-integer keys are
 71 -- ignored. The earliest value is kept, and all subsequent duplicate values are
 72 -- removed, but otherwise the array order is unchanged.
 73 ------------------------------------------------------------------------------------
 74 --]]
 75 function p.removeDuplicates(t)
 76 	checkType('removeDuplicates', 1, t, 'table')
 77 	local isNan = p.isNan
 78 	local ret, exists = {}, {}
 79 	for i, v in ipairs(t) do
 80 		if isNan(v) then
 81 			-- NaNs can't be table keys, and they are also unique, so we don't need to check existence.
 82 			ret[#ret + 1] = v
 83 		else
 84 			if not exists[v] then
 85 				ret[#ret + 1] = v
 86 				exists[v] = true
 87 			end
 88 		end	
 89 	end
 90 	return ret
 91 end			
 92 
 93 --[[
 94 ------------------------------------------------------------------------------------
 95 -- numKeys
 96 --
 97 -- This takes a table and returns an array containing the numbers of any numerical
 98 -- keys that have non-nil values, sorted in numerical order.
 99 ------------------------------------------------------------------------------------
100 --]]
101 function p.numKeys(t)
102 	checkType('numKeys', 1, t, 'table')
103 	local isPositiveInteger = p.isPositiveInteger
104 	local nums = {}
105 	for k, v in pairs(t) do
106 		if isPositiveInteger(k) then
107 			nums[#nums + 1] = k
108 		end
109 	end
110 	table.sort(nums)
111 	return nums
112 end
113 
114 --[[
115 ------------------------------------------------------------------------------------
116 -- affixNums
117 --
118 -- This takes a table and returns an array containing the numbers of keys with the
119 -- specified prefix and suffix. For example, for the table
120 -- {a1 = 'foo', a3 = 'bar', a6 = 'baz'} and the prefix "a", affixNums will
121 -- return {1, 3, 6}.
122 ------------------------------------------------------------------------------------
123 --]]
124 function p.affixNums(t, prefix, suffix)
125 	checkType('affixNums', 1, t, 'table')
126 	checkType('affixNums', 2, prefix, 'string', true)
127 	checkType('affixNums', 3, suffix, 'string', true)
128 
129 	local function cleanPattern(s)
130 		-- Cleans a pattern so that the magic characters ()%.[]*+-?^$ are interpreted literally.
131 		return s:gsub('([%(%)%%%.%[%]%*%+%-%?%^%$])', '%%%1')
132 	end
133 
134 	prefix = prefix or ''
135 	suffix = suffix or ''
136 	prefix = cleanPattern(prefix)
137 	suffix = cleanPattern(suffix)
138 	local pattern = '^' .. prefix .. '([1-9]%d*)' .. suffix .. '$'
139 
140 	local nums = {}
141 	for k, v in pairs(t) do
142 		if type(k) == 'string' then			
143 			local num = mw.ustring.match(k, pattern)
144 			if num then
145 				nums[#nums + 1] = tonumber(num)
146 			end
147 		end
148 	end
149 	table.sort(nums)
150 	return nums
151 end
152 
153 --[[
154 ------------------------------------------------------------------------------------
155 -- numData
156 --
157 -- Given a table with keys like ("foo1", "bar1", "foo2", "baz2"), returns a table
158 -- of subtables in the format 
159 -- { [1] = {foo = 'text', bar = 'text'}, [2] = {foo = 'text', baz = 'text'} }
160 -- Keys that don't end with an integer are stored in a subtable named "other".
161 -- The compress option compresses the table so that it can be iterated over with
162 -- ipairs.
163 ------------------------------------------------------------------------------------
164 --]]
165 function p.numData(t, compress)
166 	checkType('numData', 1, t, 'table')
167 	checkType('numData', 2, compress, 'boolean', true)
168 	local ret = {}
169 	for k, v in pairs(t) do
170 		local prefix, num = mw.ustring.match(tostring(k), '^([^0-9]*)([1-9][0-9]*)$')
171 		if num then
172 			num = tonumber(num)
173 			local subtable = ret[num] or {}
174 			if prefix == '' then
175 				-- Positional parameters match the blank string; put them at the start of the subtable instead.
176 				prefix = 1
177 			end
178 			subtable[prefix] = v
179 			ret[num] = subtable
180 		else
181 			local subtable = ret.other or {}
182 			subtable[k] = v
183 			ret.other = subtable
184 		end
185 	end
186 	if compress then
187 		local other = ret.other
188 		ret = p.compressSparseArray(ret)
189 		ret.other = other
190 	end
191 	return ret
192 end
193 
194 --[[
195 ------------------------------------------------------------------------------------
196 -- compressSparseArray
197 --
198 -- This takes an array with one or more nil values, and removes the nil values
199 -- while preserving the order, so that the array can be safely traversed with
200 -- ipairs.
201 ------------------------------------------------------------------------------------
202 --]]
203 function p.compressSparseArray(t)
204 	checkType('compressSparseArray', 1, t, 'table')
205 	local ret = {}
206 	local nums = p.numKeys(t)
207 	for _, num in ipairs(nums) do
208 		ret[#ret + 1] = t[num]
209 	end
210 	return ret
211 end
212 
213 --[[
214 ------------------------------------------------------------------------------------
215 -- sparseIpairs
216 --
217 -- This is an iterator for sparse arrays. It can be used like ipairs, but can
218 -- handle nil values.
219 ------------------------------------------------------------------------------------
220 --]]
221 function p.sparseIpairs(t)
222 	checkType('sparseIpairs', 1, t, 'table')
223 	local nums = p.numKeys(t)
224 	local i = 0
225 	local lim = #nums
226 	return function ()
227 		i = i + 1
228 		if i <= lim then
229 			local key = nums[i]
230 			return key, t[key]
231 		else
232 			return nil, nil
233 		end
234 	end
235 end
236 
237 --[[
238 ------------------------------------------------------------------------------------
239 -- size
240 --
241 -- This returns the size of a key/value pair table. It will also work on arrays,
242 -- but for arrays it is more efficient to use the # operator.
243 ------------------------------------------------------------------------------------
244 --]]
245 
246 function p.size(t)
247 	checkType('size', 1, t, 'table')
248 	local i = 0
249 	for k in pairs(t) do
250 		i = i + 1
251 	end
252 	return i
253 end
254 
255 
256 local function defaultKeySort(item1, item2)
257 	-- "number" < "string", so numbers will be sorted before strings.
258 	local type1, type2 = type(item1), type(item2)
259 	if type1 ~= type2 then
260 		return type1 < type2
261 	else -- This will fail with table, boolean, function.
262 		return item1 < item2
263 	end
264 end
265 
266 --[[
267 	Returns a list of the keys in a table, sorted using either a default
268 	comparison function or a custom keySort function.
269 ]]
270 function p.keysToList(t, keySort, checked)
271 	if not checked then
272 		checkType('keysToList', 1, t, 'table')
273 		checkTypeMulti('keysToList', 2, keySort, { 'function', 'boolean', 'nil' })
274 	end
275 	
276 	local list = {}
277 	local index = 1
278 	for key, value in pairs(t) do
279 		list[index] = key
280 		index = index + 1
281 	end
282 	
283 	if keySort ~= false then
284 		keySort = type(keySort) == 'function' and keySort or defaultKeySort
285 		
286 		table.sort(list, keySort)
287 	end
288 	
289 	return list
290 end
291 
292 --[[
293 	Iterates through a table, with the keys sorted using the keysToList function.
294 	If there are only numerical keys, sparseIpairs is probably more efficient.
295 ]]
296 function p.sortedPairs(t, keySort)
297 	checkType('sortedPairs', 1, t, 'table')
298 	checkType('sortedPairs', 2, keySort, 'function', true)
299 	
300 	local list = p.keysToList(t, keySort, true)
301 	
302 	local i = 0
303 	return function()
304 		i = i + 1
305 		local key = list[i]
306 		if key ~= nil then
307 			return key, t[key]
308 		else
309 			return nil, nil
310 		end
311 	end
312 end
313 
314 --[[
315 	Returns true if all keys in the table are consecutive integers starting at 1.
316 --]]
317 function p.isArray(t)
318 	checkType("isArray", 1, t, "table")
319 	
320 	local i = 0
321 	for k, v in pairs(t) do
322 		i = i + 1
323 		if t[i] == nil then
324 			return false
325 		end
326 	end
327 	return true
328 end
329 
330 -- { "a", "b", "c" } -> { a = 1, b = 2, c = 3 }
331 function p.invert(array)
332 	checkType("invert", 1, array, "table")
333 	
334 	local map = {}
335 	for i, v in ipairs(array) do
336 		map[v] = i
337 	end
338 	
339 	return map
340 end
341 
342 --[[
343 	{ "a", "b", "c" } -> { ["a"] = true, ["b"] = true, ["c"] = true }
344 --]]
345 function p.listToSet(t)
346 	checkType("listToSet", 1, t, "table")
347 	
348 	local set = {}
349 	for _, item in ipairs(t) do
350 		set[item] = true
351 	end
352 	
353 	return set
354 end
355 
356 --[[
357 	Recursive deep copy function.
358 	Preserves identities of subtables.
359 	
360 ]]
361 local function _deepCopy(orig, includeMetatable, already_seen)
362 	-- Stores copies of tables indexed by the original table.
363 	already_seen = already_seen or {}
364 	
365 	local copy = already_seen[orig]
366 	if copy ~= nil then
367 		return copy
368 	end
369 	
370 	if type(orig) == 'table' then
371 		copy = {}
372 		for orig_key, orig_value in pairs(orig) do
373 			copy[deepcopy(orig_key, includeMetatable, already_seen)] = deepcopy(orig_value, includeMetatable, already_seen)
374 		end
375 		already_seen[orig] = copy
376 		
377 		if includeMetatable then
378 			local mt = getmetatable(orig)
379 			if mt ~= nil then
380 				local mt_copy = deepcopy(mt, includeMetatable, already_seen)
381 				setmetatable(copy, mt_copy)
382 				already_seen[mt] = mt_copy
383 			end
384 		end
385 	else -- number, string, boolean, etc
386 		copy = orig
387 	end
388 	return copy
389 end
390 
391 function p.deepCopy(orig, noMetatable, already_seen)
392 	checkType("deepCopy", 3, already_seen, "table", true)
393 	
394 	return _deepCopy(orig, not noMetatable, already_seen)
395 end
396 
397 --[[
398 	Concatenates all values in the table that are indexed by a number, in order.
399 	sparseConcat{ a, nil, c, d }  =>  "acd"
400 	sparseConcat{ nil, b, c, d }  =>  "bcd"
401 ]]
402 function p.sparseConcat(t, sep, i, j)
403 	local list = {}
404 	
405 	local list_i = 0
406 	for _, v in p.sparseIpairs(t) do
407 		list_i = list_i + 1
408 		list[list_i] = v
409 	end
410 	
411 	return table.concat(list, sep, i, j)
412 end
413 
414 --[[
415 -- Finds the length of an array, or of a quasi-array with keys such
416 -- as "data1", "data2", etc., using an exponential search algorithm. 
417 -- It is similar to the operator #, but may return
418 -- a different value when there are gaps in the array portion of the table.
419 -- Intended to be used on data loaded with mw.loadData. For other tables, use #.
420 -- Note: #frame.args in frame object always be set to 0, regardless of 
421 -- the number of unnamed template parameters, so use this function for
422 -- frame.args.
423 --]]
424 
425 function p.length(t, prefix)
426 	-- requiring module inline so that [[Module:Exponential search]]
427 	-- which is only needed by this one function
428 	-- doesn't get millions of transclusions
429 	local expSearch = require("Module:Exponential search")
430 	checkType('length', 1, t, 'table')
431 	checkType('length', 2, prefix, 'string', true)
432 	return expSearch(function(i)
433 		local key
434 		if prefix then
435 			key = prefix .. tostring(i)
436 		else
437 			key = i
438 		end
439 		return t[key] ~= nil
440 	end) or 0
441 end
442 function p.inArray(arr, valueToFind)
443 	checkType("inArray", 1, arr, "table")
444 	
445 	-- if valueToFind is nil, error?
446 	
447 	for _, v in ipairs(arr) do
448 		if v == valueToFind then
449 			return true
450 		end
451 	end
452 	
453 	return false
454 end
455 
456 return p