A question about the Python 3 re module
import re
s="![](/img/2020pic/02/1.jpg) 以及: ![](/img/2020pic/02/2.png)"
pattern = re.compile('[(](.+?)(:?.png|.jpg)[)]')
result = pattern.findall(s)
for i in result:
    print(i)
The matches come out as follows:
('/img/2020pic/02/1', '.jpg')
('/img/2020pic/02/2', '.png')
Why is each match split into a tuple? If I want to pull out /img/2020pic/02/1.jpg and the other png path as separate whole strings, how should I change the pattern?
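A minimal sketch of what is probably going on here: `re.findall` returns one tuple per match whenever the pattern has more than one capturing group, and `(:?.png|.jpg)` is an ordinary capturing group (it reads as "an optional literal colon", not as the non-capturing `(?:...)`). The name `fixed` is introduced only for illustration and is not from the original post.

```python
import re

s = "![](/img/2020pic/02/1.jpg) 以及: ![](/img/2020pic/02/2.png)"

# Two capturing groups, so findall yields one tuple per match.
original = re.compile(r'[(](.+?)(:?.png|.jpg)[)]')
print(original.findall(s))

# One capturing group; the extension alternation is non-capturing (?:...)
# and the dot is escaped so it only matches a literal ".".
fixed = re.compile(r'[(](.+?\.(?:png|jpg))[)]')
print(fixed.findall(s))
```

With a single capturing group, `findall` returns a flat list of strings instead of tuples.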
1
ysc3839 2020-10-02 19:07:47 +08:00 via Android
What does the last sentence mean? Could you give an example?
|
2
WaterWestBolus OP @ysc3839 I mean I want to write a regular expression that pulls the ```/img/2020pic/02/1.jpg``` and ```/img/2020pic/02/2.png``` fields out of the s above and puts them in a list. The expected list is:
``` ['/img/2020pic/02/1.jpg','/img/2020pic/02/2.png'] ``` |
3
1462326016 2020-10-02 19:21:28 +08:00
Maybe something like this?
```
import re
s = "![](/img/2020pic/02/1.jpg) 以及: ![](/img/2020pic/02/2.png)"
pattern = re.compile(r'\((.+?[.png|.jpg])\)')
result = pattern.findall(s)
for i in result:
    print(i)
```
The formatting may get mangled... replies don't seem to support markdown |
4
WaterWestBolus OP @1462326016
Thanks for your reply. I just tried your code, and with s = ' (( RubberPencil) p).write("Hello");' it actually matches the string '( RubberPencil) p', which is very puzzling.. |
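A hedged sketch of why that match happens: square brackets define a character class, so `[.png|.jpg]` matches exactly one character out of `.`, `p`, `n`, `g`, `|`, `j` rather than the suffix `.png` or `.jpg`. The `p` right before the closing parenthesis satisfies it. The variable names below are illustrative only.

```python
import re

s = ' (( RubberPencil) p).write("Hello");'

# Character class: a single character from the set . p n g | j
char_class = re.compile(r'\((.+?[.png|.jpg])\)')
print(char_class.findall(s))

# Grouped alternation: requires a literal ".png" or ".jpg" before ")"
alternation = re.compile(r'\((.+?\.(?:png|jpg))\)')
print(alternation.findall(s))
```

The first pattern finds `'( RubberPencil) p'`, while the second finds nothing in this string because no image suffix is present.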
5
ysc3839 2020-10-02 20:32:31 +08:00 via Android
|
6
iNaru 2020-10-02 20:52:06 +08:00
(?<=\().+?\.(?:jpg|png)(?=\))
|
7
AlisaDestiny 2020-10-02 21:38:05 +08:00
In [2]: p = re.compile(r'\((.+?\.(?:jpg|png))\)')
In [3]: p.findall(s)
Out[3]: ['/img/2020pic/02/1.jpg', '/img/2020pic/02/2.png'] |
8
WaterWestBolus OP |
9
ysc3839 2020-10-02 23:26:42 +08:00 via Android
|
10
ysc3839 2020-10-02 23:29:28 +08:00 via Android
|
11
JCZ2MkKb5S8ZX9pq 2020-10-02 23:55:51 +08:00
re.compile(r'(?<=\]\().*?\.(?:png|jpg)(?=\))')
I tried it and this works |
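For reference, a small sketch of how the lookaround variant behaves, assuming the same `s` as in the original post. Because the pattern has no capturing groups, `findall` returns each whole match as a string, and the lookbehind and lookahead check the surrounding `](` and `)` without including them in the result. The names `lookaround` and `paths` are made up for this example.

```python
import re

s = "![](/img/2020pic/02/1.jpg) 以及: ![](/img/2020pic/02/2.png)"

# (?<=\]\() asserts that "](" sits immediately before the match.
# (?=\)) asserts that ")" follows the match.
# Neither assertion consumes characters, so both stay out of the result.
lookaround = re.compile(r'(?<=\]\().*?\.(?:png|jpg)(?=\))')
paths = lookaround.findall(s)
print(paths)
```

Note that Python's `re` only accepts fixed-width lookbehinds, which `\]\(` satisfies.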
12
brucmao 2020-10-03 00:51:44 +08:00
|
13
krixaar 2020-10-03 08:35:38 +08:00 1
The goal is just a list, so you don't necessarily have to solve it in the regex itself. Wouldn't result = [''.join(i) for i in pattern.findall(s)] do the job directly?
|
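A runnable sketch of the join approach, assuming the pattern and `s` from the original post are left unchanged. Each tuple returned by `findall` holds the path stem and the extension, so concatenating the pieces restores the full path. The name `pairs` is introduced here for readability.

```python
import re

s = "![](/img/2020pic/02/1.jpg) 以及: ![](/img/2020pic/02/2.png)"
pattern = re.compile(r'[(](.+?)(:?.png|.jpg)[)]')

# findall returns (stem, extension) tuples because of the two groups.
pairs = pattern.findall(s)

# Concatenate the pieces of each tuple back into one string.
result = [''.join(pair) for pair in pairs]
print(result)
```

This sidesteps the tuple output but keeps the original pattern's loose matching, since the unescaped `.` still matches any character.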
14
chaogg 2020-10-04 20:05:40 +08:00
>>> pattern = re.compile(r'\!\[.*?\]\((.+?\.(?:jpg|png))\)')
>>> pattern.findall(s)
['/img/2020pic/02/1.jpg', '/img/2020pic/02/2.png'] |
15
biglazycat 2020-10-24 16:01:09 +08:00
line = "![](/img/2020pic/02/1.jpg) 以及: ![](/img/2020pic/02/2.png)"
pattern = re.compile(r'\((\S+)\)')
result = pattern.findall(line)
print(result) |
16
biglazycat 2020-11-23 04:54:37 +08:00
>>> s="![](/img/2020pic/02/1.jpg) 以及: ![](/img/2020pic/02/2.png)"
>>> re.findall(r'[/\w]+\/\w+\.\w+', s)
['/img/2020pic/02/1.jpg', '/img/2020pic/02/2.png'] |