
I can be given a string in any of these formats:

I would like to extract the host and if present a port. If the port value is not present I would like it to default to 80.

I have tried urlparse, which works fine for the URL, but not for the other format. When I use urlparse on hostname:port, for example, it puts the hostname in the scheme rather than in netloc.

I would be happy with a solution that uses urlparse and a regex, or a single regex that could handle both formats.

1 Answer

0 votes
by (16.8k points)

By using regex, you can do something like this:

import re

# optional scheme, then the host, then an optional ':' followed by the port digits
p = r'(?:http.*://)?(?P<host>[^:/ ]+):?(?P<port>[0-9]*).*'

m = re.search(p,'http://www.abc.com:123/test')

m.group('host') # 'www.abc.com'

m.group('port') # '123'

Or, without port:

m = re.search(p,'http://www.abc.com/test')

m.group('host') # 'www.abc.com'

m.group('port') # '' i.e. you'll have to treat this as '80'
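If you would rather lean on urlparse, as the question suggests, one possible approach is to prepend '//' whenever the string has no scheme, so that the parser treats the leading part as netloc. The sketch below is only an illustration under that assumption: the function name extract_host_port and its default_port parameter are made up for this example, it uses Python 3's urllib.parse.urlsplit, and it only covers the URL and hostname:port forms.

```python
import re
from urllib.parse import urlsplit


def extract_host_port(text, default_port=80):
    """Return (host, port) from a URL or a bare 'host[:port]' string.

    Hypothetical helper for illustration; not part of any library.
    """
    text = text.strip()
    # Without a 'scheme://' prefix, urlsplit would not fill in netloc,
    # so prepend '//' to mark what follows as the network location.
    if not re.match(r'^[A-Za-z][A-Za-z0-9+.-]*://', text):
        text = '//' + text
    parts = urlsplit(text)
    # parts.port is None when no port was given; it is already an int otherwise.
    port = parts.port if parts.port is not None else default_port
    return parts.hostname, port


print(extract_host_port('http://www.abc.com:123/test'))  # ('www.abc.com', 123)
print(extract_host_port('www.abc.com:123'))              # ('www.abc.com', 123)
print(extract_host_port('www.abc.com'))                  # ('www.abc.com', 80)
```

Note that, unlike the regex above, this returns the port as an int, and the hostname attribute is lowercased by urlsplit.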
