Assume you want to extract details such as the hostname from the output of “show version” on multiple devices at once. This post talks about how to parse that data using Python.
Using this method you can extract anything you want from any other output too. For instance, you can extract all BGP neighbour info from the output of “show ip bgp summary”. The primary skills required are a basic understanding of regular expressions and, of course, a little bit of Python.
You could use a site like regex101 to test your regular expressions.
Sample output of “show version” from a Cisco device
rtr-012-ce01#show version
Cisco IOS XE Software, Version 03.16.05.S - Extended Support Release
Cisco IOS Software, ISR Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 15.5(3)S5, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2017 by Cisco Systems, Inc.
Compiled Thu 19-Jan-17 09:28 by mcpre

Let’s see how we can parse data using Python.
Use Case#1 — Extract Hostname
We know that the hostname is found at the start of a line, followed by a # and the word “show”. To capture it, we could use a regular expression like “^(\S+)#show”. On some other devices the hostname may be followed by a “>” or another symbol; you can modify the regex accordingly.
Let’s break down this regular expression:
- ^ asserts position at the start of a line.
- ( ) represents a capturing group, basically the match that you are interested in.
- \S represents any non-whitespace character
- + matches the previous token between one and unlimited times, as many times as needed.
- #show matches the literal characters “#show”.
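The breakdown above can be checked against a couple of prompt lines. This is only a minimal sketch: the sample string is made up for illustration, and the character class [#>] is one possible way to accept either prompt symbol, not the only one.

```python
import re

# Hypothetical sample lines; the second one uses a ">" prompt instead of "#".
sample = "rtr-012-ce01#show version\nrtr-012-ce02>show version\n"

# [#>] accepts either prompt character after the captured hostname.
pattern = re.compile(r"^(\S+)[#>]show", re.M)

hostnames = pattern.findall(sample)
print(hostnames)  # ['rtr-012-ce01', 'rtr-012-ce02']
```

Because \S+ is greedy, it first swallows the whole prompt and then backtracks until the prompt symbol and “show” can match, which leaves only the hostname in the capturing group.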
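import os
import re

# Read every saved command output in the 'input' directory
for file in os.listdir('input'):
    with open(os.path.join('input', file), 'r') as f:
        data = f.read()
    # Raw string so that \S reaches the regex engine unchanged
    print(re.findall(r'^(\S+)#show', data, re.M))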
$ python3 script1.py
['rtr-012-ce01']
['rtr-012-ce02']
['rtr-039-ce02']
['rtr-039-ce01']
['rtr-017-ce01']
Since there is only one item in the returned list, we can use list indexing to fetch it.
    print(re.findall(r'^(\S+)#show', data, re.M)[0])
$ python3 script1.py
rtr-012-ce01
rtr-012-ce02
rtr-039-ce02
rtr-039-ce01
rtr-017-ce01
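One thing to keep in mind with list indexing: if a file contains no match, findall returns an empty list and [0] raises an IndexError. A small sketch of one way to guard against that is shown here; the helper name extract_hostname is hypothetical and not part of the original script.

```python
import re

def extract_hostname(data):
    """Return the first hostname found in data, or None if nothing matches."""
    matches = re.findall(r"^(\S+)#show", data, re.M)
    # An empty list is falsy, so this avoids an IndexError on matches[0].
    return matches[0] if matches else None

print(extract_hostname("rtr-012-ce01#show version\nCisco IOS XE Software"))  # rtr-012-ce01
print(extract_hostname("no prompt in this text"))  # None
```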
Explanation:
We import Python’s re module and use its findall function. We tell Python to find every match of the regular expression pattern in the data variable, with the re.M flag set. re.M (or re.MULTILINE) tells Python to treat ^ as the start of each line and $ as the end of each line, rather than only the start and end of the whole string.
re.findall returns a list of matches. Instead of re.findall, you could also have used re.search or re.match; I just prefer re.findall. For a single deterministic match, re.search or re.match is probably better, but in my experience re.findall pays off in the long run when you are dealing with complex output where you want to return multiple matches. For example, you might want to return all interfaces from the output of “show ip int br | ex unass” or all BGP neighbor IPs from “show ip bgp summary”.
This is just an introduction to parsing command outputs using Python. In other blog posts I will combine all these concepts into a real-world use case.
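To see what re.M changes, here is a small sketch with a made-up capture in which another line comes before the prompt. The text in data is hypothetical and only there to illustrate the flag.

```python
import re

# Hypothetical capture where a banner line precedes the prompt line.
data = "Connected to device\nrtr-012-ce01#show version\n"

# Without re.M, ^ only matches at the very start of the string.
without_flag = re.findall(r"^(\S+)#show", data)
# With re.M, ^ also matches right after every newline.
with_flag = re.findall(r"^(\S+)#show", data, re.M)

print(without_flag)  # []
print(with_flag)     # ['rtr-012-ce01']
```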
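The difference between the two approaches shows up as soon as there is more than one match. The sketch below uses a made-up, simplified “show ip bgp summary” table with documentation addresses (192.0.2.x); real output has more header lines, and the pattern is only an assumption about how neighbor rows look.

```python
import re

# Hypothetical, simplified neighbor table; not taken from a real device.
bgp_summary = (
    "Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd\n"
    "192.0.2.1       4 65001    1200    1180       42    0    0 1d02h              15\n"
    "192.0.2.2       4 65002     900     910       42    0    0 3d11h               8\n"
)

# A dotted quad at the start of a line, followed by whitespace.
neighbor_re = r"^(\d+\.\d+\.\d+\.\d+)\s"

# re.search stops at the first match and returns a match object (or None).
first = re.search(neighbor_re, bgp_summary, re.M)
# re.findall keeps going and returns every captured group as a list.
all_neighbors = re.findall(neighbor_re, bgp_summary, re.M)

print(first.group(1))  # 192.0.2.1
print(all_neighbors)   # ['192.0.2.1', '192.0.2.2']
```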