Contest Log Analysis

So, can we step back from the contest logger and try to see the bigger picture? Yes.

Data Sources

To start with, I will assume that you only have the following:

  • Dx Cluster/Skimmer data file

99% of all contests need this, and if you are using pen and paper - or Excel... well, it's the 21st century: get a contest logger and move with the times.

The data processing tools we will use are:

  • Open source
  • Free
  • Work on just about every software platform

Analysis Software

You will need to install the following:

  • Python3

I strongly suggest that you install Python using a venv mechanism (please Google how to do this).

Python 3 Modules

As a bare minimum you will need to add the following modules to your venv Python 3 environment.

  • Numpy
  • Pandas
  • Matplotlib
  • Jupyter

To set up an environment from scratch (Linux/Mac), it goes like this:

cd 
python3 -m venv ~/.pecontest
source ~/.pecontest/bin/activate
# We are now using our private Python installation
# So no need for sudo/root commands
pip3 install numpy pandas matplotlib jupyter 
# That is it

Load the data

So, with a Python environment set up, we start like this:

cd 
python3 -m venv ~/.pecontest
source ~/.pecontest/bin/activate
mkdir ~/contest-data
cp <Your Skimmer/Cluster data file> ~/contest-data/<contest_name>
cd ~/contest-data
jupyter notebook 

We should now have a Jupyter notebook loaded.
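
A sensible first cell is a quick sanity check that the notebook really is running from the pecontest venv and can see the modules we installed; something like this (just an import check, nothing contest specific):

import sys
print(sys.prefix)          # should point at ~/.pecontest

# If these import cleanly the venv has everything we need
import numpy, pandas, matplotlib
print(pandas.__version__)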

This is what my data file looks like (this is Skimmer format):

2020-09-26 01:55:36Z   14085.8  VR2CC       26-Sep-2020 0155Z   23 dB  45 BPS  DE RTTY       <DU3TW-#>
2020-09-26 01:55:38Z   14090.8  YB8UTI      26-Sep-2020 0155Z   21 dB  45 BPS  DE RTTY       <DU3TW-#>
2020-09-26 01:55:44Z   14085.8  VR2CC       26-Sep-2020 0155Z   22 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:55:52Z   21090.7  JA4XHF/3    26-Sep-2020 0155Z   18 dB  45 BPS     RTTY       <DU3TW-#>
2020-09-26 01:55:53Z   14090.8  YB8UTI      26-Sep-2020 0155Z   17 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:55:55Z   21098.1  YB2MM       26-Sep-2020 0155Z   12 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:55:55Z   21098.7  YB2MM       26-Sep-2020 0155Z   20 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:55:55Z   21100.0  YB2MM       26-Sep-2020 0155Z   11 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:55:55Z   21098.1  YB2MM       26-Sep-2020 0155Z   12 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:55:55Z   21098.7  YB2MM       26-Sep-2020 0155Z   20 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:55:56Z   21100.0  YB2MM       26-Sep-2020 0155Z   11 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:55:59Z   14083.9  YC1WCK      26-Sep-2020 0155Z   21 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:56:02Z   14083.3  JA2FSM      26-Sep-2020 0156Z   18 dB  45 BPS  DE RTTY       <DU3TW-#>
2020-09-26 01:56:02Z   14084.9  JA2FSM      26-Sep-2020 0156Z   19 dB  45 BPS  DE RTTY       <DU3TW-#>
2020-09-26 01:56:04Z   14083.3  JA2FSM      26-Sep-2020 0156Z   18 dB  45 BPS  DE RTTY       <DU3TW-#>
2020-09-26 01:56:04Z   14084.9  JA2FSM      26-Sep-2020 0156Z   19 dB  45 BPS  DE RTTY       <DU3TW-#>
2020-09-26 01:56:08Z   14089.8  JA1RRA      26-Sep-2020 0156Z   14 dB  45 BPS     RTTY       <DU3TW-#>
2020-09-26 01:56:11Z   21098.1  YB2MM       26-Sep-2020 0156Z   11 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:56:11Z   21098.7  YB2MM       26-Sep-2020 0156Z   19 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:56:12Z   21098.1  YB2MM       26-Sep-2020 0156Z   11 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:56:12Z   21098.7  YB2MM       26-Sep-2020 0156Z   19 dB  45 BPS  CQ RTTY       <DU3TW-#>
2020-09-26 01:56:16Z   14088.9  VU2DED      26-Sep-2020 0156Z   22 dB  45 BPS     RTTY       <DU3TW-#>
2020-09-26 01:56:19Z   14084.0  JA2FSM      26-Sep-2020 0156Z   27 dB  45 BPS     RTTY       <DU3TW-#>
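
You can also eyeball the raw file from inside the notebook; the file name here is the one used in the parsing code later on.

# Print the first few raw lines of the skimmer file
with open("cqwwrtty2020.txt", "rt") as infile:
    for line in infile.readlines()[:5]:
        print(line.rstrip())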

Preparing the data

The first step is turning this Skimmer/Telnet cluster data file into a standard pandas DataFrame.

This uses a set of Python modules that I wrote (ham.dxcc), available on my GitHub page.
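
If you want to check the modules are behaving before running the full import below, a quick interactive poke at the lookup interface (the method and attribute names here are exactly the ones used in the code that follows) looks something like this:

from ham.dxcc import DxccAll
from ham.band import HamBand

dx = DxccAll()
rec = dx.find("VR2CC")                 # look a callsign up in the DXCC data
print(rec.Country_Name, rec.CQ_Zone, rec.Continent_Abbreviation)

band = HamBand()
print(band.khz_to_m(14085.8))          # kHz in, band in metres out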

import pandas as pd
from ham.dxcc import DxccAll
from ham.band import HamBand
import pickle


dx=DxccAll()

with open("cqwwrtty2020.txt","rt") as infile:
    data=infile.read().split('\n')


from dataclasses import dataclass
@dataclass
class RttySkimmer:
    when:str
    freq:float
    call:str
    when2:str
    sigdb:int
    baudrate:str
    cqde: str
    mode: str


class RttySkimmerLoader:
    '''2020-09-26 01:55:36Z   14085.8  VR2CC       26-Sep-2020 0155Z   23 dB  45 BPS  DE RTTY       <DU3TW-#>'''
    '''                                                                                                    1'''
    '''          1         2         3         4         5         6         7         8         9         0'''
    '''01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890'''
    def __init__(self, list_of_lines):
        self.obj=[]
        self.data=[]
        fields=[(0,20),
                 (22,30),
                 (32,42),
                 (43,62),
                 (63,66),
                 (70,73),
                 (78,81),
                 (82,90)]
        for n in list_of_lines:
            parts = [n[f[0]:f[1]].strip() for f in fields] 
            self.data.append(parts)
            self.obj.append(RttySkimmer(*parts))

    def names(self):
        return ['when','freq','call','when2','sigdb','baudrate','cqde','mode']

    def get_obj(self):
        return self.obj

    def get_data(self):
        return self.data


rtty = RttySkimmerLoader(data)   # Chop the file into RttySkimmer objects
#rtty_data_list=rtty.get_obj()  # Get these objects back
contest=pd.DataFrame.from_records(rtty.get_data())
contest.columns=rtty.names()
#
# Change some data types
# This could be done in the importer function....
# (the '0'+x trick stops empty fields from breaking float()/int())
contest['freq'] = contest.freq.apply(lambda x: float('0'+x))
contest['sigdb'] = contest.sigdb.apply(lambda x: int('0'+x))
contest['baudrate'] = contest.baudrate.apply(lambda x: int('0'+x))
contest['when'] = contest.when.apply(lambda x: pd.to_datetime(x))
contest['when2'] = contest.when2.apply(lambda x: pd.to_datetime(x))
# Check the top of the Dataframe
contest.head()

# Add some extra fields
#

def get_country(call):
    fnd = dx.find(call)
    if fnd and fnd.Country_Name:
        return fnd.Country_Name
    else:
        return ""

def get_cqzone(call):
    fnd = dx.find(call)
    if fnd and fnd.CQ_Zone:
        return fnd.CQ_Zone
    else:
        return 0


def get_lat(call):
    fnd = dx.find(call)
    if fnd and fnd.Latitude:
        return fnd.Latitude
    else:
        return 0


def get_lon(call):
    fnd = dx.find(call)
    if fnd and fnd.Longitude:
        # The source data appears to store longitude with west positive,
        # so flip the sign to get the usual east-positive value
        return -1*fnd.Longitude
    else:
        return 0

def get_continent(call):
    fnd = dx.find(call)
    if fnd and fnd.Continent_Abbreviation:
        return fnd.Continent_Abbreviation
    else:
        return ""

def get_time_hour(when):
    """ Round the time to the nearest hour"""
    return when.round('h').hour   

band = HamBand()
# Fields I create have an upper-case name, so I can tell whether a column is source or inferred data.
print("Calculating Country")
contest['Country']=contest.call.apply(lambda x: get_country(x))
print("Calculating Band")
contest['Band']=contest.freq.apply(lambda x: band.khz_to_m(x))
print("Calculating CQZone")
contest['CQZone']=contest.call.apply(lambda x: get_cqzone(x))

print("Calculating Lat")
contest['Lat']=contest.call.apply(lambda x: get_lat(x))
print("Calculating Lon")
contest['Lon']=contest.call.apply(lambda x: get_lon(x))
print("Calculating Continent")
contest['Cont']=contest.call.apply(lambda x: get_continent(x))
print("Calculating Rounded Hour")
contest['ZHour']=contest.when.apply(lambda x: get_time_hour(x))

print("Saving Data")
with open('contest.pkl',"wb") as ofp:
    pickle.dump(contest,ofp)
print("Done")

We now have a pickle file, which has a consistent format:

when freq call when2 sigdb baudrate cqde mode Country Band CQZone Lat Lon Cont ZHour
0 2020-09-26 01:55:36+00:00 14085.8 VR2CC 2020-09-26 01:55:00+00:00 23 45 DE RTTY Hong Kong 20.0 24 22.28 114.18 AS 2.0
1 2020-09-26 01:55:38+00:00 14090.8 YB8UTI 2020-09-26 01:55:00+00:00 21 45 DE RTTY Indonesia 20.0 28 -7.30 109.88 OC 2.0
2 2020-09-26 01:55:44+00:00 14085.8 VR2CC 2020-09-26 01:55:00+00:00 22 45 CQ RTTY Hong Kong 20.0 24 22.28 114.18 AS 2.0
3 2020-09-26 01:55:52+00:00 21090.7 JA4XHF/3 2020-09-26 01:55:00+00:00 18 45 RTTY Japan 15.0 25 36.40 138.38 AS 2.0
4 2020-09-26 01:55:53+00:00 14090.8 YB8UTI 2020-09-26 01:55:00+00:00 17 45 CQ RTTY Indonesia 20.0 28 -7.30 109.88 OC 2.0

and has some extra columns, such as:

  • sigdb
  • Baudrate (RTTY contest)
  • cqde
  • Country
  • Band
  • CQ Zone
  • Their Lat/Lon
  • Continent
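
In part 2 we will not need to re-parse the raw skimmer file at all; getting the prepared DataFrame back is just a matter of reading the pickle, along these lines:

import pickle

# Reload the DataFrame we saved above
with open('contest.pkl', "rb") as ifp:
    contest = pickle.load(ifp)

contest.head()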

At this point we are ready to start looking into the data, which will be covered in part 2 of the log analysis. As that part will be using some geo data, we need to add a few more packages first.

Prerequisites

This is for a Mac... (assuming you are using your pecontest version of Python).

brew install gdal
pip install pyproj==1.9.6
pip install geopandas seaborn
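
A quick way to confirm the geo stack went into the right place (again, nothing more than an import check inside the pecontest venv):

# If these imports succeed, the geo packages are ready for part 2
import pyproj
import geopandas
import seaborn
print(geopandas.__version__, seaborn.__version__)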