Parsing GPS data with Bash

Last time we looked at how to get the data from the GPS, and now we will have a look at how to parse it. It turns out that, except for the check-sum, everything is pretty straightforward, even in Bash.


So, why Bash in the first place? Well, there’s no real reason except that, for something else I’m working on, it’s the ideal glue-code language: it lets me simply invoke other programs that I do not want to re-code (or take parts of) to do what I want. I must say that I even have a C# version of the GPS data grabber, but while fancier, it does not bring much more than the Bash version.

A typical GPS message looks like


Here, we have a series of comma-separated fields, ended by * and a check-sum (let us ignore those for now). The message type, $GPGGA, gives the 3D location and accuracy data, containing things like the UTC time of capture, the number of satellites being tracked, and the position (here xxxx.xxxx and yyyy.yyyy; let’s keep some privacy).

Splitting the message into fields is trivial in Bash (as it would be in C# where it would suffice to use string.Split(...) to get essentially the same result):

IFS=, # set the Internal Field Separator

# splits the string $message into a
# list, each item separated by $IFS
# (left unquoted on purpose: quoting would prevent the split)
message_fields=( $message )


It is now possible to use the variable message_fields as a list. For example, ${message_fields[0]} yields the message type "$GPGGA", and ${message_fields[2]} contains the latitude. (To remove the check-sum, one could do something like $(echo "$message" | cut -d'*' -f 1) to recover the part of the message before *.)
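As a minimal sketch of the splitting step, the snippet below separates the check-sum from the payload with parameter expansion and then splits the payload with read -a, which keeps the change to IFS local to that one command. The helper name split_nmea and the variables nmea_fields and nmea_checksum are assumptions made for illustration, and the sample sentence is a commonly circulated GGA example, not a reading from my GPS.

```shell
#!/usr/bin/env bash

# split_nmea is a hypothetical helper name, not something from the post.
# It fills two global variables: nmea_checksum and the array nmea_fields.
split_nmea() {
    local message=$1

    # everything after the last '*' is the check-sum
    # (if there is no '*', this is the whole message)
    nmea_checksum=${message##*\*}

    # everything before the first '*' is the comma-separated payload
    local payload=${message%%\**}

    # IFS=, applies to this read only; -a stores the fields in an array
    IFS=, read -r -a nmea_fields <<< "$payload"
}

# sample sentence for illustration only (assumed, not captured data)
sample='$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47'
split_nmea "$sample"

echo "type:      ${nmea_fields[0]}"
echo "latitude:  ${nmea_fields[2]} ${nmea_fields[3]}"
echo "check-sum: $nmea_checksum"
```

Compared with setting IFS globally, this form leaves the rest of the script's word splitting untouched, which matters once the script grows beyond a few lines.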

* *

Using $IFS is not the only way of processing the data. Good ol’ friends cut, tr, and sed help just as much if you’re not planning to do extensive (pre)processing. Here, I just grab the data from the file or device open on file descriptor 4:

while read -r this_line
do
    # if cr/lf bothers you, make it lf only
    # (os-specific concern)
    this_line=$( echo "$this_line" | sed s/$'\r'//g )

    # get a precise time stamp
    # %N = nanoseconds
    ts=$(date +"%Y/%m/%d %H:%M:%S.%N")

    echo "$ts $this_line" >> full-log.txt

    # let us filter the current position
    if [[ "$this_line" =~ "GPRMC" ]]
    then
        # ok, it looks like a GPS reading (may be void)
        # if field 3 is V, the reading is void (or maybe
        # only untrusted?), if it is A, then the position
        # is Active (and therefore given with confidence?)
        # fields 4-7 are all empty when there is no position
        if [[ $(echo "$this_line" | cut -d, -f 4-7) != ",,," ]]
        then
            # get status, latitude and longitude
            # (sed strips the leading zeros of each field, but
            # not the zeros that follow the decimal point)
            gps_pos=($(echo "$this_line" | \
                cut -d, -f 3-7 | \
                tr , ' ' | \
                sed -E 's/(^| )0+([0-9])/\1\2/g'))
            # show
            echo "$ts" "${gps_pos[@]}"
        fi
    fi
done <&4

And this yields something like

2013/03/23 00:10:55.381921982 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:55.416345523 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:55.451376341 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:55.486083583 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:55.521025127 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:56.334357097 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:57.323532397 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:58.325286590 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:10:59.321872511 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:11:00.328458852 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:11:01.328374601 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:11:02.328710428 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:11:03.327115325 A xxxx.1229 N yyyy.2298 W
2013/03/23 00:11:04.328974192 A xxxx.1229 N yyyy.2298 W
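
The loop above reads from file descriptor 4, which has to be opened before the loop starts. Here is a minimal sketch of that step, assuming the data comes from a plain file or from a serial device that is already configured. The helper name count_rmc and the variable gps_source are hypothetical, the sentences are commonly circulated samples rather than my readings, and a temporary file stands in for the device so that the snippet runs anywhere.

```shell
#!/usr/bin/env bash

# count_rmc: hypothetical helper; reads NMEA lines from the file or
# device given as $1 through descriptor 4 and prints how many of
# them are GPRMC sentences
count_rmc() {
    local gps_source=$1 this_line count=0

    # open the source for reading on descriptor 4
    exec 4< "$gps_source"

    while read -r this_line <&4
    do
        if [[ $this_line == *GPRMC* ]]
        then
            count=$(( count + 1 ))
        fi
    done

    # close descriptor 4 once done
    exec 4<&-

    echo "$count"
}

# stand-in for the device: a temporary file with sample sentences
tmp=$(mktemp)
printf '%s\r\n' \
    '$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47' \
    '$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A' \
    > "$tmp"
count_rmc "$tmp"
rm -f "$tmp"
```

With a real receiver, the argument would be the device path instead of the temporary file; a dedicated descriptor keeps standard input free for anything else the script needs to read.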

* *

I am not saying that you should do all your data processing using Bash, merely that for some things it should not be frowned upon: it may still do the job quite nicely.

Further post-processing of the data is application-specific, and could be done in just about anything. As long as you capture as much of the available data as possible (the above snippet time-stamps the data and stores it in another file, full-log.txt), you should be just fine. And since text isn’t that heavy (and is highly compressible in this case), you should really feel at ease to grab just everything.

Tools like Gnumeric (or Excel) can load the log file by treating it as a space-or-comma-separated file, and you can use that to look at the data. Let’s do just that.

But in a next entry. To be continued…

8 Responses to Parsing GPS data with Bash

  1. […] previous installments, we looked at how to read data from a serial-over-USB GPS, then how to parse the data. However, […]

  2. […] a few previous entries, we looked at how to capture, parse, and evaluate the precision of a NMEA-capable GPS. The […]

  3. […] a few other entries, I’ve toyed with GPS, either getting or parsing the data with Bash, assessing or using the GPS data. However, when we use GPS, we suppose that the […]

  4. Trevor Young says:

    A bit of an old post. I do a lot of nmea logging/parsing in bash and I’ve found the following to be a good way to do it, and might be helpful to someone.

    I organize things up neatly into functions and try to stick to bash builtins for speed. Like you mention, read the data and put it into an array. Then I see what sentence came in. If it is something I am interested in, validate the checksum, parse the array indexes I want, and echo out to the file. Using a case statement with a function for each type of data line lets you have custom behavior for each type of data if you want.

    set -o errexit
    set -o pipefail
    set -o nounset
    ## Placeholder: replace with the path of your log file
    readonly LOGFILE="your-logfile"
    fn_parse_gga() {
        ## Placeholder sketch: this function was not shown in the comment,
        ## and the fields echoed here are chosen for illustration only.
        ## Validate the checksum, then echo the array indexes of interest
        fn_verify_xor_checksum "${LINE}" || return 0
        echo "${ARRAY[1]:-} ${ARRAY[2]:-} ${ARRAY[3]:-} ${ARRAY[4]:-} ${ARRAY[5]:-}"
    }
    fn_read_data() {
        local LINE ARRAY
        ## Read data
        while read -r LINE; do
            ## Strip out non printable chars
            LINE="${LINE//[^[:print:]]/}"
            ## Chop LINE into an array
            ## include * char so checksum is a
            ## separate array index
            IFS=' ,*' read -r -a ARRAY <<< "${LINE}" || continue
            ## What kind of sentence is it?
            case "${ARRAY[0]:-}" in
                \$??GGA)   fn_parse_gga ;;
                ## Other sentences of interest...
                ## Chuck anything else
                *) continue ;;
            esac
            unset LINE ARRAY
        done > "${LOGFILE}"
    }
    fn_verify_xor_checksum() {
        ## NMEA 0183 style checksum verification
        ## Function expects input sentence as arg
        ## Function will return a 1 if checksum is invalid
        local STRING="${1:-}" NMEA_CKSUM LEN NEXTVAL GEN_CKSUM x XOR=0
        ## Process the input string
        LEN=${#STRING}
        ## Checksum is after the last *
        NMEA_CKSUM="${STRING##*\*}"
        ## Checksum should be only 2 chars
        [[ ${#NMEA_CKSUM} = 2 ]] || return 1
        ## Loop through STRING, convert each char to ascii val and xor.
        ## Checksumming should start at second character (x=1) to skip
        ## the starting $
        for (( x=1; x<LEN; x++ )); do       ## C style for loop
            ## Stop if you hit the checksum, just in case...
            [[ ${STRING:$x:3} = \*${NMEA_CKSUM} ]] && break
            ## Convert char to ascii value
            printf -v NEXTVAL '%d' "'${STRING:$x:1}"
            ## xor with the running value
            ## (|| true: a zero result must not trip errexit)
            (( XOR ^= NEXTVAL )) || true
        done
        ## Convert final value into hex
        printf -v GEN_CKSUM '%02X' "${XOR}"
        ## Compare calculated checksum to existing one
        [[ ${NMEA_CKSUM} = "${GEN_CKSUM}" ]] || return 1
    }
    ## execute the script
    fn_read_data

    A bit long in the tooth, but adds some extra checksum verification and ability to do additional things with other lines of data. Pair this with the socat command and you can set up some really cool data piping/parsing/logging.

