Technotes

Technotes for future me

Split a file in blocks and do something with each block

This article shows you how to split a file into multiline blocks and do something with each block. This is useful for certificate chains and other files made up of multiline blocks.

Splitting a file in blocks

Most tools that work on text-based files, such as CSV files, let you split a line into multiple parts separated by, for example, a comma. If you have a file with blocks spanning multiple lines, it’s harder to find a good guide on how to process it. This guide tries to be as clear as possible.

The example file I’m using is a certificate chain file. You’ve probably seen one of those: it contains a few certificates. A truncated example:

-----BEGIN CERTIFICATE-----
MIIGO[...]grxckatBjE6CayMDaIHXKgI8oNu/snqhxGfcnrQyS6iu1libVL9VdsNFgCUyBcgl
dtlM9GjvT6JtMVYW
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIE+zCCA+OgAwIBA[...]hS+1NGClXwmgmkMd1L8tRNaN2v11y18WoA5hwnA9Ng==
-----END CERTIFICATE-----

This file has clearly separate “entities”, namely the certificates. Each entity is delimited by the lines -----BEGIN CERTIFICATE----- and -----END CERTIFICATE-----. When I refer to a block in this guide, I mean one such entity. In this example that’s one certificate.

It could also be that you have a long list file in which the entities are grouped into blocks, for example a combined .csv file you want to split up.

If you need a certificate chain file, fire up your favorite search engine and search for one.

My file is named test.crt; replace it in the commands with your own filename.

The command below splits the file into blocks and prints the first two lines of each block to the shell:

OLDIFS=$IFS; IFS=';' blocks=$(sed -n '/-----BEGIN /,/-----END/ {/-----BEGIN / s/^/\;/; p}'  test.crt);
for block in ${blocks#;}; do
    echo "$block" | head -n 2
    echo "==== SEPARATOR ===="
done; IFS=$OLDIFS

Example output:

-----BEGIN CERTIFICATE-----
MIIGODCCBSCgAwIBAgIQD8j++0QhE3owwBaJFRNMGzANBgkqhkiG9w0BAQsFADBk
==== SEPARATOR ====
-----BEGIN CERTIFICATE-----
MIIE+zCCA+OgAwIBAgIQCHC8xa8/25Wakctq7u/kZTANBgkqhkiG9w0BAQsFADBl
==== SEPARATOR ====
-----BEGIN CERTIFICATE-----
MIIDtzCCAp+gAwIBAgIQDOfg5RfYRv6P5WD8G/AwOTANBgkqhkiG9w0BAQUFADBl
==== SEPARATOR ====

Here’s a short explanation of what the loop does:

  • First it saves the old IFS, the internal field separator. IFS is a shell variable that defines the characters the shell uses to split the result of an unquoted expansion into words. By default these are space, tab and newline.
  • Second, the IFS is set to a semicolon (;)
  • The file is split up into blocks using sed, which prefixes the first line of each block with a semicolon.
  • The sed output is stored in the blocks variable. The for loop iterates over ${blocks#;}, which strips the leading semicolon. Because the expansion is unquoted, the shell splits it on the semicolon, giving one block per iteration.
  • Inside the loop an action is performed on each block.
  • The IFS is restored to what it was.
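The save-and-restore steps can also be avoided by running the loop in a subshell. The following is a minimal sketch of that variation, not the method used above: the function name first_lines and the placeholder block contents (AAAA, BBBB, CCCC) are made up for illustration and are not real certificate data.

```shell
#!/bin/sh
# Hypothetical sample file with two fake "certificates" (placeholder data only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
-----BEGIN CERTIFICATE-----
AAAA
BBBB
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
CCCC
-----END CERTIFICATE-----
EOF

# first_lines FILE: print the first two lines of every block in FILE.
# The function body is wrapped in ( ), so it runs in a subshell: the IFS
# change cannot leak into the calling shell and no OLDIFS is needed.
first_lines() (
    blocks=$(sed -n '/-----BEGIN /,/-----END/ {/-----BEGIN / s/^/\;/; p}' "$1")
    IFS=';'
    # Unquoted on purpose: the shell splits the expansion on IFS (the semicolon).
    for block in ${blocks#;}; do
        printf '%s\n' "$block" | head -n 2
        echo "==== BLOCK END ===="
    done
)

first_lines "$sample"
```

The trade-off is that variables set inside the subshell (a counter, for example) are not visible after the function returns.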

If your file has semicolons in it, you must change the separator in the commands. It could be a colon, or any other character that does not occur in the file.
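One way to choose such a character is to test a few candidates against the file first. This is only a sketch: the helper pick_separator, the candidate list and the BEGIN/END sample data are assumptions made for illustration. The candidates deliberately exclude /, & and \ because those have a special meaning in the sed substitution.

```shell
#!/bin/sh
# Hypothetical sample whose blocks contain both semicolons and colons.
sample=$(mktemp)
cat > "$sample" <<'EOF'
BEGIN
name;alice
time:10:30
END
BEGIN
name;bob
END
EOF

# pick_separator FILE: print the first candidate character that does not
# occur anywhere in FILE; return 1 if every candidate is already in use.
pick_separator() {
    for candidate in ';' ':' '|' '@' '%'; do
        if ! grep -qF -- "$candidate" "$1"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

sep=$(pick_separator "$sample") || { echo "no free separator found" >&2; exit 1; }

# Double quotes around the sed script so that $sep is expanded by the shell.
blocks=$(sed -n "/^BEGIN/,/^END/ {/^BEGIN/ s/^/$sep/; p}" "$sample")

(
    IFS=$sep
    n=0
    for block in ${blocks#"$sep"}; do
        n=$((n + 1))
        # Show the second line of each block to prove the data is intact.
        printf 'block %s: %s\n' "$n" "$(printf '%s\n' "$block" | sed -n 2p)"
    done
)
```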

If your blocks are delimited by other text, the format for the split is as follows:

sed -n '/TOP SPLIT LINE/,/BOTTOM SPLIT LINE/ {/TOP SPLIT LINE/ s/^/\;/; p}' test.crt

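The top split line must appear twice: once as the start of the range, and once as the address of the substitution that inserts the semicolon.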
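As an illustration of that format, the sketch below wraps the sed call in a function that takes the two split lines as arguments. The function name split_blocks and the "# START" / "# STOP" markers are hypothetical. The markers are used as sed regular expressions, so this assumes they contain no slash.

```shell
#!/bin/sh
# Hypothetical sample with lines outside the blocks that should be dropped.
sample=$(mktemp)
cat > "$sample" <<'EOF'
junk before
# START
one
# STOP
junk between
# START
two
three
# STOP
EOF

# split_blocks TOP BOTTOM FILE: print only the blocks of FILE, with a
# semicolon in front of the first line of each block.
# TOP is used twice, exactly as in the format shown above.
split_blocks() {
    sed -n "/$1/,/$2/ {/$1/ s/^/;/; p}" "$3"
}

split_blocks '^# START' '^# STOP' "$sample"
```

Lines that fall outside a TOP/BOTTOM pair are not printed, because sed runs with -n and only prints inside the range.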

Doing something with the separate blocks

Printing the contents of the blocks isn’t very useful; you could have done that with the entire file. Maybe you want to split the file into separate files. Let’s add a counter and write each certificate in the chain to its own file:

COUNTER=1; OLDIFS=$IFS; IFS=';' blocks=$(sed -n '/-----BEGIN /,/-----END/ {/-----BEGIN / s/^/\;/; p}'  test.crt);
for block in ${blocks#;}; do
    echo "file $COUNTER"
    echo "$block" > "cert-$COUNTER.crt"
    COUNTER=$((COUNTER +1))
done; IFS=$OLDIFS

You now have three separate files:

$ ls -lh cert*
-rw-rw-r-- 1 blaat blaat 2.2K Sep  2 11:44 cert-1.crt
-rw-rw-r-- 1 blaat blaat 1.8K Sep  2 11:44 cert-2.crt
-rw-rw-r-- 1 blaat blaat 1.4K Sep  2 11:44 cert-3.crt
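For this particular task, writing each block to its own file, awk can do the splitting and the writing in one pass. This is a different technique from the sed and IFS loop above, shown only as a sketch for comparison. The function name split_with_awk, the temporary directory and the placeholder block contents are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical working directory and sample chain (placeholder data only).
workdir=$(mktemp -d)
cat > "$workdir/chain.crt" <<'EOF'
-----BEGIN CERTIFICATE-----
AAAA
BBBB
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
CCCC
-----END CERTIFICATE-----
EOF

# split_with_awk FILE DIR: write every block of FILE to DIR/cert-N.crt.
# Runs in a subshell so the cd does not affect the calling shell.
split_with_awk() (
    cd "$2" || exit 1
    awk '
        /-----BEGIN / { n++; inblock = 1 }            # a new block starts
        inblock       { print > ("cert-" n ".crt") }  # write the line to the file of block n
        /-----END /   { inblock = 0 }                 # the block is finished
    ' "$1"
)

split_with_awk "$workdir/chain.crt" "$workdir"
ls "$workdir"
```

The sed and IFS loop remains the more general tool, since it lets you run any command on each block rather than only writing it to a file.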

Or maybe you want to print the subject of each certificate, which includes the common name (CN):

OLDIFS=$IFS; IFS=';' blocks=$(sed -n '/-----BEGIN /,/-----END/ {/-----BEGIN / s/^/\;/; p}'  test.crt);
for block in ${blocks#;}; do
    echo "$block" | openssl x509 -noout -subject -in -
done; IFS=$OLDIFS

Example output:

subject=C = NL, L = Den Haag, O = Koninklijke Bibliotheek, OU = ICT, CN = www.bibliotheek.nl
subject=C = NL, ST = Noord-Holland, L = Amsterdam, O = TERENA, CN = TERENA SSL CA 3
subject=C = US, O = DigiCert Inc, OU = www.digicert.com, CN = DigiCert Assured ID Root CA
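If you only want the common name itself, the subject line can be trimmed with sed. The sketch below assumes the output format shown above, where fields are written as "CN = value" and the CN is the last field of the subject. Other OpenSSL versions or certificates may print the subject differently. The function name cn_only is made up for illustration.

```shell
#!/bin/sh
# cn_only: read "subject=..." lines on stdin and print only the CN value.
# Assumes the "CN = value" format and that CN is the last field on the line.
cn_only() {
    sed -n 's/^subject=.*CN = //p'
}

# Demonstration using the subject lines from the example output above.
printf '%s\n' \
    'subject=C = NL, L = Den Haag, O = Koninklijke Bibliotheek, OU = ICT, CN = www.bibliotheek.nl' \
    'subject=C = NL, ST = Noord-Holland, L = Amsterdam, O = TERENA, CN = TERENA SSL CA 3' \
    'subject=C = US, O = DigiCert Inc, OU = www.digicert.com, CN = DigiCert Assured ID Root CA' \
    | cn_only
```

Inside the loop, the function would be appended to the existing pipeline after the openssl command.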

Kubernetes get certificate info

#!/bin/bash
# Fetch the TLS secret, extract the base64-encoded tls.crt field and decode it.
kubectl get secrets blaataap.com -n platform -o json | jq -r '.data["tls.crt"]' | base64 -d > output.crt

COUNTER=1; OLDIFS=$IFS; IFS=';' blocks=$(sed -n '/-----BEGIN /,/-----END/ {/-----BEGIN / s/^/\;/; p}'  output.crt);

# Print subject, issuer and validity dates for every certificate in the chain.
for block in ${blocks#;}; do
    echo "certificate $COUNTER"
    # echo "$block" | openssl x509 -noout -subject -issuer -startdate -enddate -in - # OpenSSL 1
    echo "$block" | openssl x509 -noout -subject -issuer -startdate -enddate - # OpenSSL 3
    COUNTER=$((COUNTER +1))
done; IFS=$OLDIFS

Source:
https://raymii.org/s/tutorials/Bash_bits_split_a_file_in_blocks_and_do_something_with_each_block.html

Last updated on 11 Dec 2023
Published on 31 Oct 2023