YAML Ain’t Markup Language. From official documentation, YAML is a data serialization standard for all programming languages. Let’s break it down to understand what does MarkUp and data serialization standard really means and the relevance of YAML for network engineers.
Q1:- What is a Markup Language?
Ans:- A Markup Language is not really a language but rather a standard or a set of instructions that give some meaning to the text in the document. For example, HTML, the most common Mark Language that standards for HyperText Markup Language where the text in the document is given more meaning by the use of predefined tags.
Example of <b>bold</b> in HTML.
Tag <b> clearly determines that which part of the text is supposed to be bold and that is how markings/tags are giving additional meaning to the text. Another similar example is XML which stands for Extensible MarkUp language. The concept is the same, the tags differentiate the text from the meaning of the text. An <IPAddress> tag notifies that the text until the close of the </IPAddress> tag is an IP address.
<?xml version="1.0"?>
<Response MajorVersion="1" MinorVersion="0">
<Get ItemNotFoundBelow="true">
<Configuration>
<InterfaceConfigurationTable MajorVersion="4" MinorVersion="2">
<InterfaceConfiguration>
<Naming>
<Active>act</Active>
<InterfaceName>Loopback0</InterfaceName>
</Naming>
<Description>desc-loop0</Description>
<Shutdown NotFound="true"/>
<IPV4Network MajorVersion="5" MinorVersion="1">
<Addresses>
<Primary>
<IPAddress>1.1.1.1</IPAddress>
<Netmask>255.255.0.0</Netmask>
</Primary>
</Addresses>
</IPV4Network>
</InterfaceConfiguration>
</InterfaceConfigurationTable>
</Configuration>
</Get>
<ResultSummary ErrorCount="0" ItemNotFoundBelow="true"/>
</Response>
Q2:- What is data serialization?
Ans:- It is the process of giving structure to the source data before it’s transmitted to the destination where it is deserialized to reconstruct the clone of source data.
In this evolving world of network automation and orchestration, multiple systems interact with source data which in our case is most of the time the network equipment. While all these systems might have varying internal architectures and the way they process and deal with data but while talking to each other they must follow the same set of standards and data serialization techniques.
While there may be more data serialization standards, the most prevalent for network engineers/automation is below.
- JSON
- YAML
- XML
The choice of data serialization technique largely depends on the complexity of the data, speed, and the need for human readability. While YAML in my opinion is most human-readable as it doesn’t need any brackets or braces but XML probably is most secure and reliable when idempotency is the key and the data is complex to deal with like NETCONF.
JSON on other hand is ideal when dealing with APIs for it allows more readability than XML but still has more structure than a YAML and hence less prone to errors, unlike YAML where indentation is the key.
Let’s talk about basic YAML syntax:-
- Yaml files can have .yaml or .yml extension.
- — Three dashes at the top of yaml document signifies the start of the document. While this is not a mandatory requirement for a document to be considered yaml format. Three dashes are only relevant when the yaml file is a multi document file. Let’s take an example to make this poit more clear.
In either of the above cases, python is able to read both the files exactly same. The — descriptor only becomes relevant when we are dealing with multi document file. For example:-
If you now try to read this file with python, you get this output.
Unlike earlier, where everything after ‘—‘ was considered as single entity in this case ‘—‘ serves as a page break which internally marks as end of one document and start of other and hence you don’t see both hosts as separate dictionaries rather than as a list of dictionaries
- … Three dots signify the end of the file. This is again not a mandate, this might have been relevant in the older drafts of YAML but this is not relevant even if yaml file is multi document. All three examples below behave exactly same.
- Indentation is extremely important and same level of indentation represents the scope of the block
- A single dash ‘ – ‘ represents individual block or in other words each item of a list.
- A colon ‘ : ‘ represents key value pair or a dictionary.
- Values can be specified in single qoutes, double qoutes or without quoutes. However if there is a \n in the value, you need to use qoutes in that/such cases.
- The representation of list in yaml for intfs key in the top and bottom hostname are both valid. For example:-
In either case, the output of both hosts is same structure.
Another EXAMPLE
Python Code:-
# Single file single document.
import yaml
from pprint import pprint
with open('yaml_tut.yml') as f:
data = yaml.load(f, Loader=yaml.FullLoader)
pprint(data)
# Single file multi documents.
# can also be used to open single document yaml files too.
import yaml
from pprint import pprint
with open('yaml_tut.yaml') as f:
data = yaml.load_all(f, Loader=yaml.FullLoader)
# returns a generator that you can iterate over to fetch each document of yaml file.
print(list(data))
# for value in data:
# print(value)