Module talk:Excerpt

Module:Excerpt is permanently protected from editing because it is a heavily used or highly visible module. Substantial changes should first be proposed and discussed here on this page. If the proposal is uncontroversial or has been discussed and is supported by consensus, editors may use {{edit template-protected}} to notify an administrator or template editor to make the requested edit.

This is the talk page for discussing improvements to the Excerpt module.



Portals

This page is within the scope of WikiProject Portals, a collaborative effort to improve portals on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This module does not require a rating on the project's quality scale.
See also: List of Portals

To help centralise discussions and keep related topics together, the talk pages for all the transclude excerpt templates redirect here (as of 15 May 2020 UTC):

New tool and grant ideas

Hi guys! Tonight I had an idea for a new tool, called ExcerptHunter (inspired by CitationHunt). It's basically a semi-automatic tool for doing Template:Excerpt#Replacing summary section with excerpt of child article. I wrote a small demo to help explain. First add the following to your common.js:

mw.loader.load('//en.wikipedia.org/wiki/User:Sophivorus/ExcerptHunter.js?action=raw&ctype=text/javascript');

Then visit User:Sophivorus/ExcerptHunter and you should see the interface. Note that clicking Publish doesn't work yet, but I think the interface already conveys the idea. What do you think? The tool could grow in many ways. For example, by allowing users to limit articles to a category or topic of interest, by showing a live preview next to the wikitext, by working in other wikis, etc.

However, this new tool idea, along with some bugs and feature requests that have been piling up, and other ideas I have in mind (such as generalizing Module:Transcluder into a regex-based Module:WikitextParser) all add up to more than I'm able to handle in my volunteer time.

Therefore, I'm thinking of requesting a Rapid Grant to help me develop ExcerptHunter and WikitextParser, as well as any ideas you come up with, and generally catch up and give a boost to everything excerpt-related. What do you think? Would you support such a grant? Would you like more details, or request some specific work to be done? Looking forward to your reply! Kind regards, Sophivorus (talk) 04:28, 3 December 2023 (UTC)[reply]

@Certes @Aidan9382 I should also mention that it would be a pleasure and an honor to present a shared grant with you, in case you're interested!!! Sophivorus (talk) 13:40, 4 December 2023 (UTC)[reply]

Please please don't do this in a widespread or semiautomated way, or add tools that make it easier for others to do. The excerpt template is one of the most harmful (reader- and especially author-hostile) changes to Wikipedia in recent years, and its significant proliferation will do dramatic damage to the encyclopedia project. –jacobolus (t) 19:18, 10 December 2023 (UTC)[reply]

Well, I guess the silence and concern expressed imply my proposal wouldn't be welcome. Oh well... Sophivorus (talk) 16:54, 18 December 2023 (UTC)[reply]

I completely agree with @Jacobolus's comment above. It usually does more harm than good. Clayoquot (talk | contribs) 18:44, 13 May 2024 (UTC)[reply]

{{Infobox event}}

I see markup errors when I try to include excerpts of pages that use this template. Should it be added to the list of excluded templates? Jarble (talk) 04:43, 7 February 2024 (UTC)[reply]

Could you give an example of a page with that template which has issues? Aidan9382 _(talk) 06:59, 7 February 2024 (UTC)[reply]

@Jarble Hi! I did a simple excerpt of a page using Template:Infobox event in my sandbox and I see no problem. Can you help us reproduce the issue? Sophivorus (talk) 14:14, 7 February 2024 (UTC)[reply]

@Sophivorus and Aidan9382: The images displayed inside the infobox should not appear in the excerpt, but one of the images appeared here: how did this happen? Jarble (talk) 17:24, 7 February 2024 (UTC)[reply]

@Jarble Ah! That is expected behavior, desirable for most excerpts. If you don't want the image, you can set files=0. See what I did in your sandbox, cheers! Sophivorus (talk) 18:01, 7 February 2024 (UTC)[reply]

getTags

@Certes @Aidan9382 Hi! Today I added a new getTags method to Module:Transcluder/sandbox. The regexes are still rather simple and probably fail in many edge cases, but once it's more robust it can help us get things like galleries, blockquotes, divs, etc. Furthermore, it could be used in other methods to extract stuff like <noinclude> tags and perhaps even <ref> tags. One thing that it should handle though are self-closing tags such as <references /> and <ref name="foo" />. I hope you find this idea promising! Sophivorus (talk) 14:23, 7 February 2024 (UTC)[reply]

That looks interesting but does present challenges with Lua's limited regexp syntax. Beware of nested tags, e.g. <noinclude>The world is flat<ref>Anne Idiot</ref></noinclude>: because Lua has no equivalent of \1, a naive regexp may assume that </ref> terminates the noinclude tag here. (Someone could do the world a huge favour by implementing PCRE in Lua, but I suspect it would run for more than ten seconds.) Certes (talk) 14:56, 7 February 2024 (UTC)[reply]
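
The nested-tag pitfall described above can be sketched as follows. This is an illustration in Python's `re` module, not in Lua patterns, and it is not the module's code; the two regexes are assumptions chosen only to contrast "any closing tag ends the match" with "the closing tag must repeat the opening name".

```python
import re

text = "<noinclude>The world is flat<ref>Anne Idiot</ref></noinclude>"

# Naive: the first closing tag of any kind ends the match,
# so </ref> is mistaken for the end of the noinclude element.
naive = re.search(r"<(\w+)>.*?</\w+>", text, re.S)

# Paired: the closing tag must carry the same name that was captured
# from the opening tag (a backreference to group 1).
paired = re.search(r"<(\w+)>.*?</\1>", text, re.S)

print(naive.group(0))
print(paired.group(0))
```

Even the paired version only handles tags of different types; nested tags of the same type need some form of depth counting.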

Glad you liked it! Today I made several improvements. The method is now able to handle self-closing tags and nested tags (as long as they're of different types, but afaik nested tags of the same type are not allowed). Another edge case I didn't quite cover is <section> tags, since both opening and closing section tags are self-closing tags. But I think we can continue handling them differently in the getSection method, since they are such a special case. Next time (next week) I'd like to expand the test cases and maybe start using getTags in other methods, such as getReferences. Ideas and concerns welcome, cheers! Sophivorus (talk) 14:24, 8 February 2024 (UTC)[reply]

Hm, I just realized that HTML tags such as divs and spans can be nested and of the same type, so I'll have to refine the method further. No problem, I just hope it doesn't become slow. Sophivorus (talk) 14:53, 8 February 2024 (UTC)[reply]

That looks a lot more robust. A couple more things to watch are spaces within the tag (you've caught some of them) and parameters such as <ref name="Foo">, which is closed by just </ref>. <td> can also be a pain because the closing tag is optional; it can be closed by a second td which might look nested to a naive parser. Certes (talk) 15:33, 8 February 2024 (UTC)[reply]

@Certes Today I coded a first version of getTags that supports nested tags of the same type (things like <div>foo<div>bar</div></div>). It wasn't easy and I just got it to work, so I didn't really test it much. Next time I'll add many more test cases and fix as needed. As to things like unclosed <td> tags, in such cases my heart leans towards fixing the wikitext rather than supporting them. Sophivorus (talk) 15:44, 15 February 2024 (UTC)[reply]
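
One common way to support nested tags of the same type is to scan opening and closing tags in order and keep a depth counter. The sketch below is in Python rather than Lua and is not the actual getTags code in Module:Transcluder/sandbox; the function name `get_tags` and its regex are assumptions made for illustration. It also covers attributes such as `<ref name="Foo">` and self-closing tags, but not tags with optional closers such as `<td>`.

```python
import re

def get_tags(text, name):
    """Return the outermost <name>...</name> elements, allowing same-type nesting."""
    # One pattern for opening tags (with optional attributes),
    # self-closing tags, and closing tags of the requested name.
    token = re.compile(r"<(/?)%s\b([^<>]*?)(/?)>" % re.escape(name), re.I)
    results, depth, start = [], 0, None
    for m in token.finditer(text):
        closing, _attrs, self_closing = m.groups()
        if self_closing and not closing:
            # A self-closing tag is a complete element on its own.
            if depth == 0:
                results.append(m.group(0))
            continue
        if not closing:
            if depth == 0:
                start = m.start()  # remember where the outermost element begins
            depth += 1
        elif depth > 0:
            depth -= 1
            if depth == 0:
                results.append(text[start:m.end()])
    return results

print(get_tags("<div>foo<div>bar</div></div> x <div>baz</div>", "div"))
print(get_tags('a<ref name="foo" />b<ref name="Foo">c</ref>', "ref"))
```

A stray closing tag at depth 0 is ignored, and an element that is never closed is silently dropped; whether that is the right behaviour for malformed wikitext is a design choice.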

Have you looked around the internet for Lua HTML parsers? I don't see anything specifically for tags but there are plenty of general HTML parsers written in Lua, and some may have licences suitable for re-use here. Certes (talk) 16:26, 15 February 2024 (UTC)[reply]

Hi! I confess no, I haven't looked around. Should probably have, but then again, I enjoyed myself quite a bit while writing the code, and our use case is probably unique enough to warrant custom code anyway. I may be wrong though, but in any case, today I added several new test cases for getTags at Module:Transcluder/testcases and it's looking quite robust, dare I say. Sophivorus (talk) 15:24, 20 February 2024 (UTC)[reply]

WikitextParser

@Certes @Aidan9382 Hi again! As I mentioned before, I'm thinking of generalizing Module:Transcluder into Module:WikitextParser (Transcluder would then require and use WikitextParser). I think such a module would be more useful, easier to maintain and extend, and more likely to attract new developers. Thoughts? Sophivorus (talk) 14:30, 7 February 2024 (UTC)[reply]

That sounds useful but would be a very serious undertaking and might not perform well enough for use during page rendering. https://pypi.org/project/mwparserfromhell/ does something similar for Python and may be worth studying. Certes (talk) 15:00, 7 February 2024 (UTC)[reply]

Does sound like an interesting idea and the modularity would be nice, though I'm curious how involved/complex you intend for it to be. Also, there's already a similarly-named module somewhat related to that idea, Module:Wikitext Parsing, which is mainly to do with helping handle nowiki-like tags if that'd be of any interest. Aidan9382 _(talk) 23:02, 8 February 2024 (UTC)[reply]

It may be worth liaising with a similar development described at Wikipedia talk:Lua/Archive 12#A new template parser. Certes (talk) 16:37, 13 February 2024 (UTC)[reply]

@Aidan9382 @Certes Hi, thanks for the support and links! mwParserFromHell is definitely an inspiration. As to Module:Wikitext Parsing and Wiktionary:Module:template parser, I think they may be useful but I'm not sure how yet. Today I gathered courage and created Module:WikitextParser and Module:WikitextParser/testcases with some code taken from Transcluder. I also started an experiment on good ol' parseFlags method. There's still a long way to go and much may change, but what I currently imagine for this module is a bunch of relatively simple methods to parse wikitext, that other modules may then use and combine as they see fit. I'll try to continue development next week, feel free to contribute if you want! Cheers! Sophivorus (talk) 16:56, 22 February 2024 (UTC)[reply]

Hi again! I made a lot of progress with Module:WikitextParser, so I started testing it with Module:Transcluder/sandbox. The testcases look good so far! Some thoughts:

I'm hesitating whether to move parseFlags (or some version of it) to WikitextParser and add an extra "flags" parameter to all the methods (getTags, getTables, etc). It would certainly make the methods more useful and versatile, but also more complex and difficult to document.
I'm currently testing WikitextParser in Transcluder, but eventually I'd like to use WikitextParser in Module:Excerpt directly, instead of going through Transcluder (for performance reasons). I guess that's another reason to move parseFlags to WikitextParser.
Eventually, Transcluder would be deprecated but kept working for any modules that still use or prefer it.
WikitextParser, unlike Transcluder, doesn't throw errors, but rather returns nil when something goes wrong.

Kind regards, Sophivorus (talk) 16:08, 29 February 2024 (UTC)[reply]

Should subsections be transcluded without `subsections=yes`?

One of the subsections in this article is transcluded even if

subsections=yes

is not included as a parameter. This only happens when the section heading is in this format:

= History and motivations =

The section appears to be included in this excerpt:

{{excerpt|Computational sustainability}}

Should this section not be transcluded in this case? Jarble (talk) 16:37, 7 March 2024 (UTC)[reply]

Does this occur only when there is a single equals sign in the heading? Certes (talk) 18:56, 7 March 2024 (UTC)[reply]

@Certes: Yes, I've never seen this happen when there is more than one equals sign in the heading. Jarble (talk) 21:47, 7 March 2024 (UTC)[reply]

Per Help:Wikitext#Sections, A single = is styled as the article title and should not be used within an article. Changing to == should fix the problem and potentially fix other problems with the article too. Certes (talk) 23:31, 7 March 2024 (UTC)[reply]
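
A toy model of why a single-equals heading can leak into a lead excerpt, written in Python and not taken from the module: it assumes (hypothetically) that the lead is everything before the first heading of level 2 or deeper, so a `= Heading =` line is never seen as a section boundary. The names `HEADING` and `get_lead` and the sample text are illustrative only.

```python
import re

# Assumption: only headings written with two or more equals signs end the lead.
HEADING = re.compile(r"^==+[^=\n].*?==+[ \t]*$", re.M)

def get_lead(wikitext):
    """Return everything before the first level-2-or-deeper heading."""
    m = HEADING.search(wikitext)
    return wikitext[:m.start()] if m else wikitext

bad = "Intro.\n= History and motivations =\nText.\n== Other ==\nMore."
good = "Intro.\n== History and motivations ==\nText.\n== Other ==\nMore."

print(repr(get_lead(bad)))
print(repr(get_lead(good)))
```

Under that assumption the single-equals section stays in the lead, while the `==` version stops right after the introduction.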

Reference error

I just wanted to report that there is currently a reference error with the excerpt in this article (ref 153). I thought it was related to the fact that it uses a specific template called "Cite Moulin 2004". So I modified the transcluded reference to use a more generic format, but it doesn't appear to have solved the issue. Alenoach (talk) 02:45, 22 March 2024 (UTC)[reply]

This seems to be an issue with |templates=0 causing the {{Cite book}} inside the reference to be removed, making the reference content empty and causing an error. Aidan9382 _(talk) 07:29, 22 March 2024 (UTC)[reply]

Ok, thanks. I fixed it by whitelisting reference templates with the excerpt parameter "templates=Cite". Alenoach (talk) 01:37, 23 March 2024 (UTC)[reply]

Actually, sometimes you need a more comprehensive whitelist, like e.g. templates=Cite,cite,Citation,rp Alenoach (talk) 03:00, 23 March 2024 (UTC)[reply]
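
To illustrate why removing every template can leave an empty reference, and how a whitelist avoids it, here is a small Python sketch. It is not the module's implementation: `strip_templates`, its `keep` argument, and the case-sensitive prefix matching are assumptions made for this example, and the regex only handles templates that are not nested.

```python
import re

# Assumption: simple, non-nested templates of the form {{Name|params}}.
TEMPLATE = re.compile(r"\{\{\s*([^{}|]+?)\s*(\|[^{}]*)?\}\}")

def strip_templates(wikitext, keep=()):
    """Remove templates unless their name starts with an entry in `keep`."""
    def repl(m):
        name = m.group(1)
        return m.group(0) if any(name.startswith(k) for k in keep) else ""
    return TEMPLATE.sub(repl, wikitext)

source = "Fact.<ref>{{Cite book |title=X}}</ref>{{rp|12}}"

print(strip_templates(source))                  # the reference is left empty
print(strip_templates(source, keep=("Cite",)))  # the citation survives
print(strip_templates(source, keep=("Cite", "cite", "Citation", "rp")))
```

In this model an empty `<ref></ref>` is what would trigger the reference error, and each additional template family has to be listed explicitly.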

Ref error ruwiki

Hi. I tried to excerpt the lead from ru:Отравление Алексея Навального here - [1] and ref 4 is giving me a reference error. Does anyone know why and how to fix it? Renat 05:53, 22 March 2024 (UTC)[reply]

Excerpt a paragraph, less its bundled citation with an embedded list

Having a bundled citation with an embedded bullet list for several different sources is not unusual. I tried excluding a bundled ref at the end of the first paragraph of 2023 Brazilian Congress attack using |references=no and got a weird result, so added |lists=no on top of that, but still doesn't look right:

excerpt paragraph #1 of 2023 Brazilian Congress attack minus the refs:

{{excerpt|2023 Brazilian Congress attack |paragraphs=1 |hat=no |references=no |lists=no |inline=yes}}

On 8 January 2023, following the defeat of then-president Jair Bolsonaro in the 2022 Brazilian general election and the inauguration of his successor Luiz Inácio Lula da Silva, a mob of Bolsonaro's supporters attacked Brazil's federal government buildings in the capital, Brasília. The mob invaded and caused deliberate damage to the Supreme Federal Court, the National Congress Palace and the Planalto Presidential Palace in the Praça dos Três Poderes (English: Three Powers Square), seeking to violently overthrow the democratically elected president Lula, who had been inaugurated on 1 January. Many rioters said their purpose was to spur military leaders to launch a "military intervention" (related to a misinterpretation of the 142nd article of the Brazilian constitution and an euphemism for a coup d'état) and disrupt the democratic transition of power.<ref>Phillips, Tom (8 January 2023). "Jair Bolsonaro supporters storm Brazil's presidential palace and supreme court". The Guardian. Archived from the original on 8 January 2023.

The final text I want to see in the excerpt is, "...and disrupt the democratic transition of power." I want to keep the |inline=yes so I can tack on my own ref instead of the bundle. Anything I'm missing here? Mathglot (talk) 06:55, 15 April 2024 (UTC)[reply]

Something odd is going on. In my (now undone) rev. 1219016229, I attempted a fix by adding a second test after the first, adding param |templates:-cite=. What happened was that the two tests showed the same result, an improvement over the first attempt, where now there is only a hanging <ref> tag (and no citation content or anything else: just the opening ref tag itself) after the desired text. But the top test in that revision is unchanged from the (only) test in the previous revision (and current revision, after the undo's). So, somehow, the addition of test two in rev. 1219016229 is affecting the result of test 1 in that revision, even though I didn't change that one (afaik). Very odd. Mathglot (talk) 07:15, 15 April 2024 (UTC)[reply]

Also tried:

{{excerpt|2023 Brazilian Congress attack |paragraphs=1 |hat=no |references=no |lists=no |inline=yes|templates=-cite web,cite news}}

but no go. Mathglot (talk) 07:19, 15 April 2024 (UTC)[reply]

Module:Transcluder is getting very confused by this scenario. It appears to be including the list objects from later paragraphs (specifically *{{Cite web |title=Bolsonaro deixa o [...] and *{{Cite web |title=Brazil: Germany [...]), because getParagraphs seems to think the list objects (which in this case are actually the bundled citations) are unrelated to the paragraph, and is therefore not removing them along with said paragraph. This also consumes the references' starting ref tag, so it doesn't get removed later on. That's why, when you try to do {{Excerpt|2023 Brazilian Congress attack|references=no|paragraphs=1}} (so not specifying no lists), you get the 2 bullet points from the next 2 paragraphs leaking out instead. Aidan9382 _(talk) 07:45, 15 April 2024 (UTC)[reply]

Also, |lists=no is probably failing because it then removes the ending ref tag that matches the starting ref tag (the first reference won't be on a new line, so it doesn't get picked up as a list by Transcluder, but then the rest of the bundled citation gets consumed). Aidan9382 _(talk) 07:47, 15 April 2024 (UTC)[reply]
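
The interaction described above can be reproduced with a toy model in Python. This is not Transcluder's code: `remove_lists` and `remove_references` are hypothetical stand-ins that assume lists are detected line by line and that references are removed only as well-formed `<ref>...</ref>` pairs. The citation contents are placeholders, not the real sources.

```python
import re

# A paragraph ending in a bundled citation whose later sources are bullet items.
paragraph = (
    "and disrupt the democratic transition of power."
    "<ref>First source\n"
    "*{{Cite web |title=Second source}}\n"
    "*{{Cite web |title=Third source}}</ref>"
)

def remove_lists(wikitext):
    """Drop every line that starts with a list marker (*, #, : or ;)."""
    kept = [ln for ln in wikitext.split("\n") if not re.match(r"[*#:;]", ln)]
    return "\n".join(kept)

def remove_references(wikitext):
    """Drop well-formed <ref>...</ref> pairs only."""
    return re.sub(r"<ref[^>/]*>.*?</ref>", "", wikitext, flags=re.S)

# Lists first: the closing </ref> sits on a bullet line and is deleted with it,
# so the opening <ref> no longer has a partner to be removed with.
print(repr(remove_references(remove_lists(paragraph))))

# References first: the whole bundle is removed as a unit.
print(repr(remove_references(paragraph)))
```

In this model the outcome depends on the order of the two passes, which matches the hanging `<ref>` reported earlier in the thread.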

New Template doc section Incompatibilities

I started a new template doc section, § Incompatibilities, to hold a description (or perhaps a bullet list?) of incompatibilities between Excerpt and other templates, modules, or functions. Please add entries to it that you know of. Thanks, Mathglot (talk) 01:27, 13 June 2024 (UTC)[reply]
