regex to match HTML <p ...> tag starting with a lowercase letter - Stack Overflow

I am editing an epub file in Sigil and would like to match HTML <p> tags when the first character after the tag's closing > is lowercase. I saw some answers on this site that match p tags with attributes, but not a p tag without attributes. I don't know how regular expressions work, so I'm trying to figure out what I need to change to match both.

Examples:

<p class="calibre1">All the while I was...</p>
<p>All the while I was...</p>
<p class="calibre1">all the while I was...</p>
<p>all the while I was...</p>

The regex should match the last 2 tags in the example above.

The code that I have (/<\/?([^p](\s.+?)?|..+?)>[a-z]/) matches only the 3rd, not the 4th tag.

Important: Sigil has no HTML parser, so I have to stick to using the simple search engine which accepts regular expressions.

Asked Nov 15, 2024 at 19:38 by Michael; edited Nov 18, 2024 at 8:20 by Patrick Janser.

3 Don't use regular expressions to process HTML, use an HTML parser. – Barmar Commented Nov 15, 2024 at 20:12
2 It's a Thing That Should Not Be. – zer00ne Commented Nov 15, 2024 at 20:21
2 Don't try to run a regex directly on the HTML; use the DOM to get the paragraphs first with document.querySelectorAll('p'). Then, for each of them, test the innerText property against /^\p{Ll}/u to see whether it starts with a lowercase letter in any language (\p{Ll} requires the Unicode flag). There are several reasons: A. HTML entities: an entity such as &eacute; is é, which is a lowercase letter. B. Your paragraph could start with an inner tag, so markup can sit between <p> and the first letter of "shit starts with a lowercase...". C. Spaces, tabs, new lines, or HTML comments can precede the first letter. – Patrick Janser Commented Nov 15, 2024 at 23:49
Sorry, I forgot to mention that I'm editing an epub file using Sigil, which has only regex and no HTML parser. – Michael Commented Nov 16, 2024 at 9:18
In Sigil search for <p([^>]*)>((?:\s*||<\s*\w+[^>]*>)*)(\p{Ll}) with the regex options "Dot All" and "Unicode Property" and replace by <p\1>\2\U\3\E to directly convert the lowercase letter to its uppercase version. \U will uppercase the capturing group n°3, which is the lowercase letter. \E stops the uppercase modifier. – Patrick Janser Commented Nov 18, 2024 at 11:46
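
For anyone who wants to try the idea from the last comment outside Sigil, here is a rough Python sketch. It is not the commenter's exact method: Python's re module has neither \p{Ll} nor \U...\E, so this version matches any letter with [^\W\d_] and uppercases it in a callback. The HTML-comment alternative (<!--.*?-->) and the function name capitalize_paragraph_starts are assumptions made for illustration.

```python
import re

# Assumed adaptation of the commenter's pattern for Python's re module.
PARAGRAPH_START = re.compile(
    r'<p([^>]*)>'                           # group 1: attributes of the <p ...> tag
    r'((?:\s+|<!--.*?-->|<\s*\w+[^>]*>)*)'  # group 2: whitespace, comments, inner opening tags
    r'([^\W\d_])',                          # group 3: first letter, in any script
    re.DOTALL,                              # lets ".*?" span multi-line comments
)

def capitalize_paragraph_starts(html: str) -> str:
    """Uppercase the first letter of each paragraph if it is lowercase."""
    def fix(match: re.Match) -> str:
        attrs, prefix, letter = match.groups()
        if letter.islower():
            letter = letter.upper()
        return f'<p{attrs}>{prefix}{letter}'
    return PARAGRAPH_START.sub(fix, html)

print(capitalize_paragraph_starts('<p>all the while I was...</p>'))
print(capitalize_paragraph_starts('<p class="calibre1"><em>a</em>ll the while...</p>'))
```

Like the Sigil pattern, this does not decode entities such as &eacute;, and <p[^>]*> would also match other tags whose names begin with p (for example <pre>).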


1 Answer

The following regex looks like a good starting place:

<p[^>]*?>[a-z]

From there I'm not sure what you want to capture, but it will match both of your lowercase examples. And yes, of course you should use an HTML parser for this, but for something as simple as this I don't see why a regex is an issue, provided you know the input (it won't work on generalized HTML input).
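
If you want to check this outside Sigil, a quick Python script along these lines compares the pattern from the question with the one above on the four sample paragraphs. The helper name matching_lines is made up for this sketch, and Python's re engine is only a stand-in for Sigil's, so treat the result as indicative.

```python
import re

# The four sample paragraphs from the question.
samples = [
    '<p class="calibre1">All the while I was...</p>',
    '<p>All the while I was...</p>',
    '<p class="calibre1">all the while I was...</p>',
    '<p>all the while I was...</p>',
]

# Pattern from the question: [^p] rejects the "p" of the tag name, and the
# fallback "..+?" needs at least two characters before ">", so a bare <p> fails.
question_pattern = re.compile(r'<\/?([^p](\s.+?)?|..+?)>[a-z]')

# Pattern from this answer: "<p", optional attributes, ">", one lowercase ASCII letter.
answer_pattern = re.compile(r'<p[^>]*?>[a-z]')

def matching_lines(pattern):
    """Return the 1-based positions of the samples that the pattern matches."""
    return [i for i, s in enumerate(samples, start=1) if pattern.search(s)]

print(matching_lines(question_pattern))  # [3]
print(matching_lines(answer_pattern))    # [3, 4]
```

Two caveats: [a-z] only covers unaccented ASCII letters, and <p[^>]*?> also matches other tags whose names start with p, such as <pre>.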
