Module talk:TableTools

From WikiProjectMed
Jump to navigation Jump to search

removeDuplicate does not remove duplicate NaN

function p.removeDuplicates(t)
	checkType('removeDuplicates', 1, t, 'table')
	local isNan = p.isNan
	local ret, exists = {}, {}
	for i, v in ipairs(t) do
		if isNan(v) then
			-- NaNs can't be table keys, and they are also unique, so we don't need to check existence.
			ret[#ret + 1] = v
		else
			if not exists[v] then
				ret[#ret + 1] = v
				exists[v] = true
			end
		end	
	end
	return ret
end

This should be:

function p.removeDuplicates(t, uniqueNan)
	checkType('removeDuplicates', 1, t, 'table')
	local ret, isNan, hasNan, exists = {}, p.isNan, false, {}
	for _, v in ipairs(t) do
		-- NaNs can't be table keys in exists[], and they are also equal to each other in Lua.
		if isNan(v) then
			-- But we may want only one Nan in ret[], and there may be multiple Nan's in t[].
			if uniqueNan == nil or uniqueNan == false or uniqueNan == true and not hasNan then
				hasNan = true
				ret[#ret + 1] = v
			end
		else
			if not exists[v] then
				exists[v] = true
				ret[#ret + 1] = v
			end
		end	
	end
	return ret
end

-- verdy_p (talk) 07:50, 2 February 2014 (UTC)[reply]

@Verdy p: This was by design, as comparing two NaNs always results in false. My reasoning was that since two NaNs can never be equal to each other - even if they were made by the exact same calculation - then they shouldn't be treated as duplicates by the algorithm. Although if there's some sort of precedent for doing things a different way, please let me know. I'm fairly new to the world of NaNs, after all. — Mr. Stradivarius ♪ talk ♪ 08:01, 2 February 2014 (UTC)[reply]
That's the Lua interpretation anyway. Even if it has a single Nan value (no distinction between signaling and non-signaling ones, or Nan's carrying an integer type, like in IEEE binary 32-bit float and 64-bit double formats, neither does Java...), there are some apps that depend on using Nan as a distinctive key equal to itself, but still different from nil.
The other kind of usage of Nan is "value not set, ignore it": when computing averages for example, Nan must not be summed and not counted, so all Nan's should be removed from the table. For this case May be there should be an option to either preserve all Nan's, or nuke them all from the result: the kill option would be tested in the if-branch of your first version, and a second alternate option tested after it would be to make Nan's unique in the result.... The first case being quite common for statistics when it means "unset", while nil means something else (such as compute this value before determinig if it's a Nan, nil bring used also for weak references that can be retreived from another slow data store, and the table storing nil being a fast cache of that slow data store) verdy_p (talk) 08:29, 2 February 2014 (UTC)[reply]
I had a quick look at the functions, but cannot work out the intended usage—that usage would probably determine what should happen with NaNs. The docs for removeDuplicates says that keys that are not positive integers are ignored—but what it means is that such keys are removed. On that principle, I think it would be better to default to removing NaNs. I cannot imagine a usage example where I would want to call this function and have a NaN in the result—what would I do with it? If it were desirable to test for the presence of NaN members, why not return an extra value that is true if one or more NaNs are encountered and omitted? How would it help to ever have both 0/0 and -0/0 (or multiple instances of 0/0) in the result?
Regarding verdy_p's code: I don't think it is a good idea to deviate from Lua's idioms, and if there were a uniqueNan parameter, it should not be tested for explicit "false" and "true" values. The function regards "hello" as neither false nor true and while that is a very defensible position, it's not how Lua code is supposed to work. Johnuniq (talk) 03:26, 3 February 2014 (UTC)[reply]
It takes three values (it is a "tri-state boolean"):
  • nil the default is your existing code that keeps them all,
  • true keeps a single Nan,
  • false does not even keep this one
  • any other (unsupported) value in other types is considered like nil.
There's no such "Lua idiom" (about NaN) except that the default value is nil (already a common Lua practive), and otherwise the parameter is used as a boolean flag whose value is meant by its explicit name (and is a general good practive for naming boolean flags)... Tri-state boooleans are very common objects in many situations where we need a supplementary "may be / unknown / don't know / other / not specified / not available / missing data / to be determined later / uninitialized / refuse to say / no opinion / fuzzy" value. This fuzzy concept is also carried by Nan (within the field of numbers instead of booleans) so it is really appropriate here !
There's also no such ideom of "NaN" in Mediawiki parameters. Given that Lua already states that multiple NaN values compare effectively as "equal", it makes no sense to keep these duplicates which occur occasionally as result of computed numeric expressions (numbers in Lua never distinguish any NaN, neither does MediaWiki). You "idiom" is that used in Java or C++, which is NOT directly usd in MadiaWiki or Lua (hidden in the underlying internal representation). Keeping these duplicates should complicates situation where we expect "keys" in Lua tables to be unique. Not doing that means that we'll not properly detect overrides, and it will be impossible to map any action of value to NaN, meaning that we'll always get unspecified results that we can never catch in MediaWiki templates (this means undetected errors and unspecified behavior, wrong results in all cases). Lua treats all NaN as unique values which compare equal between each other, but different from all other non-NaN values (including "nil"/unmapped). Note that "nil" in (which matches unspecified MediaWiki values) is NOT "NaN" and is also NOT an "empty string" value (which is only one of the possible default values for unspecified but cannot match any "NaN"). "NaN" in Mediawiki is a string, and does not match the Lua NaN numeric value.
It's only when we start speaking about fuzzy logic with measured accuracy, that we introduce many other values by representing fuzzy booleans by a real number between 0.0 and 1.0, but then even this accuracy may be sometimes not measurable and will also needs an additional Nan value, so fuzzy booleans have infinite number of values between 0.0 and 1.0 plus Nan; (0.0 representing absolutely false, 0.5 representing false or true equally, and 1.0 representing absolutely true, the rule being that if you sum the response rates, you'll get always 1.0). [User:Verdy p|verdy_p]] (talk)
In summary the statement "-- NaNs can't be table keys, and they are also unique, so we don't need to check existence." is compeltely false: not checking this existence means that the line "ret[#ret + 1] = v" will be executed multiple times (each time "v" is NaN) so "ret[]" will contain multiple "NaN" values... And this is not expected.
Note you may want to have "uniqueNan" set to true by default (the code above keeps the existing default behavior when the optional "uniqueNan" parameter is not specified, i.e. it keeps these duplicates).
If you drop that optional parameter, by forcing the function to behave like if it was always true, all that is needed is to change the line "if uniqueNan == nil or uniqueNan == false or uniqueNan == true and not hasNan then" to "if not hasNan then"... I.e. you just need to check and set "hasNan = true" before adding to "ret[]". verdy_p (talk) 20:40, 5 August 2018 (UTC)[reply]
I don't know what the "correct" behavior for removeDuplicates should be when passed a NaN, either keeping them all or keeping only one. I do know that the following statements are incorrect or irrelevant:
  • there are some apps that depend on using Nan as a distinctive key equal to itself, but still different from nil – Whether such apps exist or not, in Lua neither NaN nor nil can be a key in a table. Only a value.
  • The other kind of usage of Nan is "value not set, ignore it": when computing averages for example [...] while nil means something else – That seems a rather idiosyncratic use, and not relevant to a generic "removeDuplicates" method. Particularly since for calculating averages you wouldn't want to remove duplicates in the first place.
  • For this case May be there should be an option to either preserve all Nan's, or nuke them all from the result – Removing all NaNs would be better done by a "removeNaNs" method rather than a weird flag on "removeDuplicates".
  • Given that Lua already states that multiple NaN values compare effectively as "equal" – That's incorrect. Lua treats all NaN values as unequal.
  • numbers in Lua never distinguish any NaN – I have no idea what you're trying to say here. Lua does have NaNs, for example as the result of calculating "0/0".
  • You "idiom" is that used in Java or C++, which is NOT directly usd in MadiaWiki or Lua (hidden in the underlying internal representation). – MediaWiki and Lua both use IEEE NaNs, although neither does anything special with the various types of NaN (quiet versus signaling and so on). Lua is implemented in C and uses its floating point types, including NaNs, while MediaWiki is written in PHP which is written in C (or C++ when using HHVM) and again uses C's floating point types.
  • Keeping these duplicates should complicates situation where we expect "keys" in Lua tables to be unique. — Since Lua tables cannot have NaN as a key, this seems irrelevant.
  • Lua treats all NaN as unique values which compare equal between each other – Again, this is wrong.
  • Note that "nil" in (which matches unspecified MediaWiki values) is NOT "NaN" and is also NOT an "empty string" value – I don't see where anyone else ever said it was.
  • "NaN" in Mediawiki is a string, and does not match the Lua NaN numeric value. – If you pass NaN as a template parameter, then yes, it will come into Lua as a string and will not match an actual Lua NaN. That doesn't seem at all relevant to this removeDuplicates method though.
  • It's only when we start speaking about fuzzy logic with measured accuracy – You seem to be going of on some completely unrelated tangent here.
  • In summary the statement "-- NaNs can't be table keys, and they are also unique, so we don't need to check existence." is compeltely false – On the contrary, it's largely true. In Lua NaN cannot be a table key. Since nan ~= nan is always true, arguably every NaN is "unique".
HTH. Anomie 23:11, 5 August 2018 (UTC)[reply]

Template-protected edit request on 4 March 2019

Please add function ipairsAt()/ipairsAtOffset() as demonstrated here. Ans (talk) 10:23, 4 March 2019 (UTC)[reply]

@Ans: What is the usecase for this function? {{3x|p}}ery (talk) 15:08, 4 March 2019 (UTC)[reply]
I plan to use in function p.call() in Module:LuaCall --Ans (talk) 10:40, 5 March 2019 (UTC)[reply]
Those are both so simple you could just do it inline. Also, I really hope you're not planning on actually using Module:LuaCall somewhere outside of maybe your user space, per Module talk:LuaCall#I sincerely hope that no one ever actually uses this. Anomie 13:35, 5 March 2019 (UTC)[reply]
It is not so simple, as the user need to understand the mechanism of ipairs() and iterator to do it inline. The proposed function could help users who want to iterate like this, but don't know how to do it inline. Moreover, using ipairsAt() instead of inline will improve code readability. It does not just help developers write code, but also help beginners to understand what the code do when reading the code that use this function rather than inline code.
Module:LuaCall could be used to help write debugging code instantaneously in template.
--Ans (talk) 05:00, 7 March 2019 (UTC)[reply]
  •  Not done Edit requests to fully protected pages should only be used for edits that are either uncontroversial or supported by consensus. As this has been objected to already consider you are already at stage 3 of WP:BRD. — xaosflux Talk 02:51, 8 March 2019 (UTC)[reply]
If after some period (ex. 7 days), no one has responsed or objected to my last reasons, does it be considered to be uncontroversial? --Ans (talk) 04:40, 12 March 2019 (UTC)[reply]

Template-protected edit request on 8 March 2019

Please change function p.length() to reduce loops count by one, as demonstrated here (also add note for #frame.args). Ans (talk) 05:38, 8 March 2019 (UTC)[reply]

 Done -- /Alex/21 13:06, 12 March 2019 (UTC)[reply]

Delete this module

@Hhkohh: Is it a mistake that you added a tag saying this module is going to be deleted? I can't see that from the TfD. Christian75 (talk) 12:37, 13 March 2019 (UTC)[reply]

Christian75, not delete, just being merged per TfD outcome. Hhkohh (talk) 12:46, 13 March 2019 (UTC)[reply]
I thought that, but the template you added says "This template is currently being deleted." Christian75 (talk) 12:49, 13 March 2019 (UTC)[reply]

Help in writing better testcases

Hello developers, I am working with mw:Multilingual Templates and Modules and to convert this module into a shared one, we need better mw:Module:TableTools/testcases. Can anyone please help? @Frietjes:, @RexxS: Capankajsmilyo(Talk | Infobox assistance) 10:32, 22 May 2019 (UTC)[reply]

Already done, sorry to bother. Capankajsmilyo(Talk | Infobox assistance) 10:38, 22 May 2019 (UTC)[reply]

Array length merger

@Pppery: Is there any reason the merged version in the sandbox was never synced with the main template? --Trialpears (talk) 20:48, 31 July 2019 (UTC)[reply]

Lack of confidence in my own ability to edit a module used on 8% of all pages without breaking something (Yes, I know I did edit this module in February). The code should be ready to go live. * Pppery * it has begun... 21:02, 31 July 2019 (UTC)[reply]
That is probably the correct response when dealing with these things... Having a second pair of eyes look at it (which won't be me with no module coding experience yet) sounds sensible. Also sorry for deleting your note tat you had merged it, my bad. --Trialpears (talk) 22:06, 31 July 2019 (UTC)[reply]
@Mr. Stradivarius: Does the merged code in Module:TableTools/sandbox (Special:Diff/899722883/899724164) look good to you? * Pppery * it has begun... 01:43, 23 September 2019 (UTC)[reply]
@Pppery: Sorry for the late reply. The implementation is mostly good, but for empty arrays it should return 0, whereas currently it returns nil. Also, any chance you could add some test cases for it to Module:TableTools/testcases? As for naming, I would go with something different. "Binary length" sounds like it could mean the length in bytes of a table (which I know doesn't really have a meaning for Lua tables, but hey). It also isn't clear that this only works for arrays or array-like tables. Finally and most nit-pickingly, it isn't technically a binary search; exponential searches and binary searches are slightly different. How about "arrayLength" instead? The explanation of which algorithm is used would have to be relegated to the doc page, but I think this might be a clearer approach overall. Best — Mr. Stradivarius ♪ talk ♪ 08:53, 29 September 2019 (UTC)[reply]
@Mr. Stradivarius: OK, I've fixed the 0 versus nil bug, and added some test cases. As to the naming, I don't like arrayLength, because it is not meaningfully distinguishable from the pre-existing length function, and two functions having names that are synonyms feels like bad coding style. * Pppery * it has begun... 20:57, 29 September 2019 (UTC)[reply]
@Pppery: Ah, I had forgotten that there was also a length function. I agree that it is confusing to have both length and arrayLength. How about replacing length with binaryLength? It is a much more efficient algorithm for large arrays, and only marginally less efficient for very small arrays; plus it has the same requirement that no intermediate elements in the array can be nil. If any code is affected by switching from an incremental to an exponential algorithm, that code was probably broken to begin with. — Mr. Stradivarius ♪ talk ♪ 14:25, 30 September 2019 (UTC)[reply]
Also, thank you for the algorithm fix and for adding the test case. I split the test case up into four separate test cases; this should make it easier to pinpoint the error if a future edit causes a problem with the function. — Mr. Stradivarius ♪ talk ♪ 14:27, 30 September 2019 (UTC)[reply]
@Mr. Stradivarius: I didn't go with that originally because Module:Array length has text on its documentation page saying it shouldn't be used for small arrays but yes, that suggestion does seem reasonable. * Pppery * it has begun... 21:13, 30 September 2019 (UTC)[reply]
Replacement of length with binaryLength done. * Pppery * it has begun... 18:46, 5 October 2019 (UTC)[reply]

And  merged with main module. Let's see if anything breaks ... * Pppery * it has begun... 03:41, 21 December 2019 (UTC)[reply]

Potential code simplification

Keeping in mind that I don't actually know much about Lua and could easily be overlooking some obvious problem with this suggestion, the functions isPositiveInteger() and isNan() could be slightly simplified by removing the if structures and instead directly returning on the logic, since it should return true or false, as appropriate, already. I'd demonstrate exactly what I mean in the sandbox, but there are other outstanding changes there and I didn't want to introduce irrelevant changes without approval first. ディノ千?!☎ Dinoguy1000 02:15, 31 October 2019 (UTC)[reply]

@Dinoguy1000: As the user who coded the changes currently in the sandbox, feel free to demonstrate your simplification edit. * Pppery * it has begun... 23:43, 4 November 2019 (UTC)[reply]
Done. Promisingly, all testcases appear to be passing after this, assuming I'm understanding that page correctly. ディノ千?!☎ Dinoguy1000 05:38, 5 November 2019 (UTC)[reply]
@Dinoguy1000: Looks like an improvement to me, which I will include if I stop worrying too much about editing a module with this many transclusions and actually implement the TfD I initiated in February. * Pppery * it has begun... 01:59, 8 November 2019 (UTC)[reply]
@Pppery: I think a similar improvement can be made in the cleanPattern() subroutine of the affixNums() function, though I'm not sure (again, I don't know enough about Lua or Scribunto to know if there might be some gotcha there), in the event you want to look at another thing to bundle into the edit too. ディノ千?!☎ Dinoguy1000 03:56, 8 November 2019 (UTC)[reply]

Protected edit request on 7 March 2020

change -- requiring module inline so that Module:Exponental search to -- requiring module inline so that Module:Exponential search (rationale : typo in Exponental / Exponential) 82.255.235.43 (talk) 03:45, 7 March 2020 (UTC)[reply]

 Not done for now: Please feel free to queue this in the sandbox. This module is used on many pages and making minor typos like that one doesn't make sense as its own edit. Izno (talk) 03:49, 7 March 2020 (UTC)[reply]
I'm pretty late to mention this, but I fixed this and another comment, and made another minor code simplification, in this edit to the sandbox last month. I'll leave it to more regular module editors to decide if that's significant enough to edit the live module now, though. ディノ千?!☎ Dinoguy1000 18:22, 19 May 2020 (UTC)[reply]
This module gets edited so rarely that the approach "wait for a substantive edit" isn't really feasible. * Pppery * it has begun... 21:49, 19 May 2020 (UTC)[reply]
Fair enough, I've gone ahead and deployed the sandbox version. If I misunderstood your intent, feel free to correct me. =) ディノ千?!☎ Dinoguy1000 15:51, 23 May 2020 (UTC)[reply]
You did not misunderstand me; I would have done it myself as soon as this module was no longer used on the Main Page. * Pppery * it has begun... 16:08, 23 May 2020 (UTC)[reply]

Bug in deepCopy

Has anyone tried p.deepCopy which was apparently merged in from Module:Table by Pppery in diff? The function has a bug: in function _deepCopy, the three occurrences of deepcopy should be _deepCopy. If the function has never been used, perhaps it should be deleted? I see the documentation mentions that it has some benefits over mw.clone, but I don't think functions should be included in this massively used module unless they are needed by established modules. Also, there has to be some kind of testcase for each function. Searching shows there are a few modules using their own deepcopy functions, but I can't find any trying to use this one. Please don't fix the bug yet because if the module is to be changed, the proposed changes should be in the sandbox for a week or so to give others a chance to tweak things. There are a couple of comment inelegancies that might be fixed if the module is updated. Johnuniq (talk) 04:02, 2 November 2020 (UTC)[reply]

I have no objections to deleting the functionality. Kind of embarassing that I made a mistake like that when editing a highly-visible module. * Pppery * it has begun... 04:42, 2 November 2020 (UTC)[reply]

reverseNumKeys, reverseSparseIpairs

I have added these functions to the sandbox and test cases. Trigenibinion (talk) 17:32, 22 February 2021 (UTC)[reply]

Shouldn't isArray be more generic?

@Pppery, Johnuniq, and Izno: Currently, isArray raises an error if the input is not a table, shouldn't it just return false, which is what a user would expect from an isXxx function? Furthermore, wouldn't it be better if it checked if the input was array-like? This would allow custom objects/containers to be considered array-like if they are iterable and only have numeric keys. Alexiscoutinho (talk) 14:23, 9 July 2021 (UTC)[reply]

I agree that isArray should return true or false and should never raise an error—that would be much more in keeping with Lua's style of doing the right thing when reasonable. The checkType should be replaced with a line to return false if the input is not a table. BTW, that's an amazing implementation which would never have occurred to me. I see #Bug in deepCopy above has not been addressed yet. Johnuniq (talk) 00:13, 10 July 2021 (UTC)[reply]
@Johnuniq: What do you think of this?
function (obj)
	if not pcall(pairs, obj) then
		return false
	end
	local i = 0
	for _ in pairs(obj) do
		i = i + 1
		if obj[i] == nil then
			return false
		end
	end
	return true
end
Good, but I would stick with:
function p.isArray(t)
	if type(t) ~= 'table' then
		return false
	end
	local i = 0
	for _ in pairs(t) do
		i = i + 1
		if t[i] == nil then
			return false
		end
	end
	return true
end
That's using t for simplicity and consistency with the rest of the module. It would not be possible for pcall(pairs, obj) to be successful unless obj is a table so whereas it is more pure to test for that, it is clearer to admit that a table is what works. I see that the function returns true if given {} (an empty table) and that seems desirable. Johnuniq (talk) 07:16, 10 July 2021 (UTC)[reply]
@Johnuniq: pcall(pairs, obj) would work if obj implements __pairs. Although obj would technically still be a table, it could have a custom type (by overriding type, which is what Module:Lua class does). Do you think it would be better to name my function something different (isArrayLike) then? The original one could have your tweak too. Alexiscoutinho (talk) 15:23, 10 July 2021 (UTC)[reply]
Yikes, I hadn't absorbed that Module:Lua class was seriously implementing classes. The question becomes a matter of what approach would be best in the long term, given that modules are (or will be) maintained by Wikipedians who might struggle when faced with code that has no apparent starting point or path of execution (which is the consequence of fully using classes). Also, consistency is important and isArray would look out-of-place in Module:TableTools if it is the only function with obj as the parameter and which tests for the method it is going to use rather than the variable type. I'm highly pragmatic and would use the naive code I put above, but you have a reason for wanting the proper implementation and that's ok by me. Regarding pragmatism, the heading comment for isArray (with its mention of a table) might be confusing, but a fully correct comment would be three times longer with the new code. Regarding duplicating the function, that's ugly as I'm sure you know. I haven't done anything with this module and I'm not sure why I'm watching here, although I have commented a couple of times. In other words, I'm happy for you to do whatever is wanted. It might be best to try something in the sandbox, preferably also fixing the deepCopy bug I mentioned, then update the module. Johnuniq (talk) 02:21, 11 July 2021 (UTC)[reply]
I personally think that learning to maintain class based modules is just as easy as function based ones. For more complex modules, I think OOP greatly improves readability/understanding for anyone for many reasons, e.g. variables wouldn't have to be carried everywhere in function signatures. Of course, simpler modules might not need classes. It's just a matter of using it where it's appropriate. I agree it wouldn't be symmetrical if isArray received obj, thus I will propose isArrayLike instead. Since these functions are so small, I don't think a small duplication would be bad. The two functions would have their own different use cases too. Alexiscoutinho (talk) 21:04, 11 July 2021 (UTC)[reply]
To stick up for functions, at least you can tell what the inputs are! I'm happy for you to edit as wanted and there doesn't appear to be anyone else with an opinion so please go for it. Johnuniq (talk) 22:32, 11 July 2021 (UTC)[reply]

@Alexiscoutinho: I checked your edit to Module:TableTools/sandbox and it looks good, thanks. You will see that I removed trailing whitespace and removed two unused functions added to the sandbox on 22 February 2021 because I don't think we should add functions without a specified use. After my minor tweaks, along with your comment changes (a very good idea to make them consistent), it is now a nightmare to compare the main module with the sandbox, but I believe the change is good.

Johnuniq (talk) 05:37, 12 July 2021 (UTC)[reply]

Protected edit request on 25 July 2021

This request is to implement all the changes made to the sandbox until this time. The differences can be checked with the diff link of the above section. Summarizing changes: improved module formatting; improved isArray; added isArrayLike; fixed _deepCopy; and improved defaultKeySort. Alexiscoutinho (talk) 14:27, 25 July 2021 (UTC)[reply]

Sorry this has taken so long to get to.  Done, on assurances this has been tested by you and also checked by Johnuniq. — Martin (MSGJ · talk) 10:42, 4 October 2021 (UTC)[reply]

keySort documentation or examples please

Could someone add to documentation, or point to good live examples, of the |keySort= function, as used in § keysToList and § sortedPairs? Related to Lua manual:table.sort? I need a simple non-numerical sort ;-) Thx. -DePiep (talk) 10:19, 20 November 2021 (UTC)[reply]

If you provide a short example of dummy data to be sorted I'll have a look. As a quick example, if keys is a list of values each of which might be a number or a string, the following would sort the table so numbers are sorted before strings.
table.sort(keys, function (a, b)
    if type(a) == type(b) then
        return a < b
    elseif type(a) == 'number' then
        return true
    end
end)
Johnuniq (talk) 23:29, 20 November 2021 (UTC)[reply]

Improved NaN handling

Please pull in my changes from the sandbox that correct issues with NaN handling. I already updated the testcases (I first made them fail and then fixed the issue in the sandbox). I considered further simplifying isNaN since I believe NaNs are the only values in Lua that fail the not equal to self test but it might be possible with external code, e.g., userdata (but I believe Scribunto uses none of that). That said, anything not equal to itself probably should not be considered for removal in removeDuplicates. In Lua, NaNs are always an issue when attempting to use arbitrary table values as table keys (unlike nil which cannot be used as a table value or key and mostly come into play when dealing with sparse tables or arbitrary expression lists). —Uzume (talk) 19:37, 17 December 2021 (UTC)[reply]

@Verdy p, Johnuniq, and Anomie: It should be noted that both nil and NaN values cannot be table keys, however, it is not an error to attempt to index a table with such values. Lua will throw errors preventing these values from being used as actual keys (i.e., rawset(tableX, nil, true) and rawset(tableX, 0/0, true) are guaranteed to error) but these values can be handled in index and newindex events (via __index and __newindex metamethods) without issue. As an example, I initially modified listToSet to consider all NaN values as if they were the same (i.e., if there is at least one NaN in the input array, indexing the returned set with any NaN yeilds true) but decided that was not particularly useful, given Lua considers them never equal to any value (including other NaNs or even themselves). If that is considered valuable we could revert to such. —Uzume (talk) 19:37, 17 December 2021 (UTC)[reply]

Thanks but the sandbox p.listToSet has a typo due to use of v in p.invert and item in p.listToSet. It's probably best to use v in both of them. By the way, the ping did not work because it was added in an edit of the comment. Johnuniq (talk) 00:50, 18 December 2021 (UTC)[reply]
@Johnuniq: the mispaste was easily fixed but I cleaned up the variables across the functions anyway. Thanks. —Uzume (talk) 22:43, 30 December 2021 (UTC)[reply]
I'm not going to think about anything remotely tricky for at least a week. Someone else might want to action this but I'm posting to let you know that if nothing happens you might ping me in ten days. Johnuniq (talk) 23:30, 30 December 2021 (UTC)[reply]
@Uzume and Johnuniq: is this still being worked on, or is there an agreeable version ready to go? — xaosflux Talk 15:29, 10 January 2022 (UTC)[reply]
@Xaosflux: As far as I am concerned the version in the current sandbox is "ready to go". I have rectified the one issue Johnuniq found since I originally made the request. —Uzume (talk) 16:10, 10 January 2022 (UTC)[reply]
I looked at the new code. There is another problem at line 67 due to replacement of t with arr in p.removeDuplicates in two of the three places where t was used. I will have a look at what should be done but introducing arr when the rest of the module uses t is not desirable and I suspect the new arr should go. Johnuniq (talk) 02:47, 11 January 2022 (UTC)[reply]
@Johnuniq and Xaosflux: That too was easily rectified. Please find the fixes in the current sandbox. Thank you, —Uzume (talk) 20:10, 27 January 2022 (UTC)[reply]
That looks good. However now we need to think about whether a module used on 4.9 million pages should be updated with some nice-to-have improvements that are very unlikely to make a practical difference. I'm happy either way but will wait in the hope that someone else makes that decision. Johnuniq (talk) 02:20, 28 January 2022 (UTC)[reply]
@Johnuniq: I find it amusing you consider correcting a bug in the NaN handling "some nice-to-have improvements that are very unlikely to make a practical difference". Such is certainly true if one never has any NaN values anywhere but clearly it has (broken) NaN support so apparently someone needed that. This was the main reason for updating the testcases first (to demonstrate the brokenness and subsequently rectify such). —Uzume (talk) 13:12, 30 January 2022 (UTC)[reply]
I've made the changes. Thanks for working on this and Johnuniq for checking code — Martin (MSGJ · talk) 13:10, 31 January 2022 (UTC)[reply]

Duplicate handling with "invert"

I wonder if it would be useful to have something like invert that supports duplicates by providing values that are arrays of indices from the original input array. —Uzume (talk) 19:37, 17 December 2021 (UTC)[reply]

Currently, p.invert({'a','b','c','a'}) would give {a=4,b=2,c=3}. I think the proposal is that it might give {a={1,4},b=2,c=3}. Perhaps, but this kind of consideration might be best handled after considering a real need at Wikipedia. Another possibility would be to throw an error if a duplicate occurs, possibly controlled by a parameter saying what should happen with duplicates. I don't think we should worry about that unless a need arises. Johnuniq (talk) 01:01, 18 December 2021 (UTC)[reply]
@Johnuniq: I agree about waiting for an actual need but I thought I would bring up this lack of functionality anyway. I was thinking more along the lines of: p.invert{'a','b','c','a'}{a={1,4},b={2},c={3}} but you got the gist of my suggestion/observation.—Uzume (talk) 23:00, 30 December 2021 (UTC)[reply]

p.merge() function

Hello, this module is missing one function that allows to merge multiple tables into one table. It would be nice if someone add it:

function p.merge(...)
	local tables = { ... }
	local result = {}
	for i, t in ipairs(tables) do
		checkType('merge', i, t, 'table')
		
		for k, v in pairs(t) do
			if not result[k] then
				result[k] = v
			end
		end
	end
	
	return result
end

Example: p.merge({1, 2, ["a"] = "b"}, {10, [3] = 3, ["a"] = "a"}, {["b"] = "test"})

Output: {1, 2, 3, ["a"] = "b", ["b"] = "test"}🎭 Антарктидов (AKA Antraktidov) (talk page | contribution) 21:41, 13 December 2023 (UTC)[reply]

deepEquals function

@Pppery You may or may not be interested in a revised version of the deepEquals function currently at wikt:Module:table, which I note this module lacks, which checks for structural equivalence between key/value pairs; arbitrary recursive nesting and tables-as-keys are both supported. It's got a few speed optimisations that make the code a little verbose in places, but parts could easily be simplified if you feel that's not necessary here. Theknightwho (talk) 11:24, 14 April 2024 (UTC)[reply]

Infinite loop in deepCopy

There's a bug in the deepCopy function which manifests with any kind of recursive table nesting. However, it's easy to fix, as you just need to cache the new copy in already_seen before doing the pairs loop, like this:

	if type(orig) == 'table' then
		copy = {}
		already_seen[orig] = copy
		for orig_key, orig_value in pairs(orig) do
			copy[_deepCopy(orig_key, includeMetatable, already_seen)] = _deepCopy(orig_value, includeMetatable, already_seen)
		end

		if includeMetatable then
			local mt = getmetatable(orig)
			if mt ~= nil then
				local mt_copy = _deepCopy(mt, includeMetatable, already_seen)
				setmetatable(copy, mt_copy)
				already_seen[mt] = mt_copy
			end
		end
	else -- number, string, boolean, etc
		copy = orig
	end

Theknightwho (talk) 13:03, 14 April 2024 (UTC)[reply]