Sensitivity area evaluation


  • Import modules and functions
  • Select which station(s) to include in the analysis
  • Set the date range and the hour(s) for the analysis
    1. Sensitivity area evaluation
      1. Sensitivity within user defined distances

      2. Percent sensitivity to edge of domain

      3. Sensitivity time series, averages by months and hours, and correlation to the size of the anthropogenic component


      The option to update the dictionaries refers to the custom dictionaries, which in turn can be used in the notebook "Station fingerprints".
      Note that only one custom dictionary can be saved per category: updating a dictionary replaces the previously saved values (a small example of reading a saved dictionary back in follows below).
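
      A minimal sketch of how a saved custom dictionary could be read back in, e.g. by "Station fingerprints", assuming the file sensitivity_area_evaluation_custom.pkl written at the end of section 1 and the station HTM150 from the example selection:

      import pickle
      
      #load the custom dictionary saved at the end of section 1 (file name as in the save step there)
      with open("sensitivity_area_evaluation_custom.pkl", "rb") as f:
          dictionary_sensitivity_area = pickle.load(f)
      
      #e.g. the user-defined distances and the percent sensitivity within them for one of the stations
      print(dictionary_sensitivity_area['info']['distances'])
      print(dictionary_sensitivity_area['HTM150']['percent'])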

    Import modules and functions

    In [12]:
    %run ~/STILT_modules_v2.5.ipynb
    %run ~/ICOS_atmObs_modules_v2.5.ipynb
    %run ~/for_thesis.ipynb
    
    stations = create_STILT_dictionary()
    
    import pandas as pd
    import matplotlib.pyplot as plt
    import matplotlib.pyplot as p   #the alias p is used by the time series plots further down
    from matplotlib.pyplot import figure
    import matplotlib
    import math
    import numpy as np
    import numpy
    from mpl_toolkits.basemap import Basemap
    import xlrd
    import os                       #used to check whether individual footprint files exist on disk
    import netCDF4 as cdf           #cdf.Dataset is used when opening individual footprints
    from netCDF4 import Dataset
    
    #path to footprints used many places:
    pathFP='/opt/stiltdata/fsicos2/stiltweb/'
    
    Functions defined for handling STILT output:
    add_old_STILTid
    atc_query
    atc_stationlist
    available_STILT_dictionary
    byte2string
    checklib
    createDatetimeObjList
    create_STILT_dictionary
    date_range_date_hour
    display
    figure
    func_YYYY_to_YY
    func_date_format_str
    func_list_datetimeObj
    get_ICOS_filename
    get_ICOS_list
    get_station_class
    icosDatadf
    icosMetadatadf
    is_number
    lonlat_2_ixjy
    plot_available_STILT
    plot_icos_stilt_timeseries
    plot_maps
    plot_maps_basemap
    plot_maps_population
    plot_stilt_timeseries
    read_ICOS_atm_downloaded
    read_ICOS_zipfile
    read_STILT_dictionary
    read_aggreg_footprints
    read_emissions
    read_stilt_raw_timeseries
    read_stilt_timeseries
    read_stilt_ts
    read_stilt_ts_old
    select_land_cover_classes
    select_stations
    str2dataframe
    unzip
    what_stations_see_dic2
    
    Functions defined for handling ICOS time series:
    add_old_STILTid
    atc_query
    atc_stationlist
    available_STILT_dictionary
    byte2string
    checklib
    createDatetimeObjList
    create_STILT_dictionary
    date_range_date_hour
    display
    figure
    func_YYYY_to_YY
    func_date_format_str
    func_list_datetimeObj
    get_ICOS_filename
    get_ICOS_list
    get_station_class
    icosDatadf
    icosMetadatadf
    is_number
    lonlat_2_ixjy
    plot_available_STILT
    plot_icos_stilt_timeseries
    plot_maps
    plot_maps_basemap
    plot_maps_population
    plot_stilt_timeseries
    read_ICOS_atm_downloaded
    read_ICOS_zipfile
    read_STILT_dictionary
    read_aggreg_footprints
    read_emissions
    read_stilt_raw_timeseries
    read_stilt_timeseries
    read_stilt_ts
    read_stilt_ts_old
    select_land_cover_classes
    select_stations
    str2dataframe
    unzip
    what_stations_see_dic2
    

    Select which stations to include in the analysis

    In [8]:
    list_stations_for_analysis=select_stations()
    

    Once done with the station selection, run the next cell to finalize the selection

    It is also possible to type the names of one or more additional stations to include in the analysis

    In [9]:
    only_selection_stations=list_stations_for_analysis.value
    
    all_stations=[]
    for value in only_selection_stations:
        all_stations.append(value)
        
    additional_station=input('Write additional station(s) here, otherwise leave blank. If several, separate by whitespace: ')
    
    list_additional_station = additional_station.split()
    for item in list_additional_station:
        all_stations.append(str(item))
        
    print('Your selection is: ', all_stations)
    
    icos_stations = [ist[0:3] for ist in all_stations]
    
    ICOS_files = get_ICOS_filename(icos_stations,download=True)
    
    Write additional station(s) here, otherwise leave blank. If several, separate by whitespace: 
    Your selection is:  ['HTM150', 'PRS', 'JUE', 'HEI']
    download all files:  0     ICOS_ATC_L2_L2-2019.1_HTM_70.0_CTS_CO2.zip
    1     ICOS_ATC_L2_L2-2019.1_HTM_30.0_CTS_CO2.zip
    2    ICOS_ATC_L2_L2-2019.1_HTM_150.0_CTS_CO2.zip
    Name: fileName, dtype: object
    file already exists, skip...ICOS_ATC_L2_L2-2019.1_HTM_70.0_CTS_CO2.zip
    file already exists, skip...ICOS_ATC_L2_L2-2019.1_HTM_30.0_CTS_CO2.zip
    file already exists, skip...ICOS_ATC_L2_L2-2019.1_HTM_150.0_CTS_CO2.zip
    all done
    

    Set the date range and the hour(s) for the analysis

    In [10]:
    date_range, timeselect, start_date, end_date =  date_range_date_hour()
    
    date_index_number = (len(date_range) - 1)
    
    Choose start date year (vary for station, earliest 2006): 2017
    Choose start date month (write 1-12): 1
    Choose start date day (write number): 1
    Choose end date year (vary for station, earliest 2006): 2018
    Choose end date month (write 1-12): 1
    Choose end date day (write number): 1
    Choose which footprints to display: (0 3 6 9 12 15 18 21) 0 3 6 9 12 15 18 21
    

    Back to top

    1A. Sensitivity within user defined area limits

    This cell looks at how much of the sensitivity at the stations is within user-defined distances. An equirectangular approximation is used to calculate the distance from the station to each STILT cell, which determines whether a cell value is included or not (a sketch of the approximation follows below).

    The user defines three distances, and the sensitivity within each of them is added up and displayed in bar graphs. The two output bar graphs contain the same data, but in the second one the results for each station are stacked.
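
    A minimal sketch of the equirectangular approximation used in the cell below, with the same Earth radius (R = 6373.8 km) as in the code; the coordinates in the example are made-up values for illustration only:

    import math
    
    R = 6373.8  #Earth radius in km, as in the cell below
    
    def equirect_distance_km(lat1, lon1, lat2, lon2):
        #equirectangular approximation: x is the east-west separation scaled by the
        #cosine of the mean latitude, y is the north-south separation
        x = math.radians(lon1 - lon2) * math.cos(math.radians(lat1 + lat2) / 2)
        y = math.radians(lat1 - lat2)
        return math.sqrt(x * x + y * y) * R
    
    #example with made-up coordinates (same latitude, one degree of longitude apart)
    print(equirect_distance_km(56.10, 13.42, 56.10, 14.42))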

    In [13]:
    #save in a dictionary:
    dictionary_sensitivity_area={}
    
    import time
    #user put in distances
    distances_pbl = input("Choose THREE distances from the station to check influence, from smallest to largest. Space between each entry: ")
    
    update_dictionaries=input('Do you want to update the dictionaries with the current selections? Write yes to update: ')
    
    distances_pbl_list = distances_pbl.split()
    distances_pbl_list = [int(a) for a in distances_pbl_list] 
    
    start = time.time()
    
    #excel file with the cols/rows of stations.
    ExcelFileName= 'stilt_stations_col_row.xls'
    workbook = xlrd.open_workbook(ExcelFileName)
    worksheet = workbook.sheet_by_name('stilt_stations_col_row')
    num_rows = worksheet.nrows 
    
    num_cols = worksheet.ncols
    
    #for each of the stations, add to these lists:
    station_list=[]
    
    average_footprint_1_list=[]
    average_footprint_2_list=[]
    average_footprint_3_list=[]
    
    #the total of all cells
    averaged_summed_footprint_list=[]
    
    #just for the lat/lon values:
    f_fp= Dataset('final_netcdf_point_source_emission.nc')
    
    dictionary_sensitivity_area['info']={}
    
    dictionary_sensitivity_area['info']['date_min']=(str(date_range[0].year) + '-' + str(date_range[0].month) + '-' + str(date_range[0].day))
    
    dictionary_sensitivity_area['info']['date_max']=(str(date_range[date_index_number].year) + '-' + str(date_range[date_index_number].month) + '-' + str(date_range[date_index_number].day))
    
    dictionary_sensitivity_area['info']['hours']=(str(timeselect))
    
    dictionary_sensitivity_area['info']['stations']=all_stations
    
    dictionary_sensitivity_area['info']['distances']=distances_pbl_list
    
    #loop over all selected stations:
    my_iterator=iter(all_stations)
    for ist in my_iterator:
        
        dictionary_sensitivity_area[ist]={}
        
        dictionary_sensitivity_area[ist]['percent']={}
        
        dictionary_sensitivity_area[ist]['absolute']={}
        
        #lists to collect how many cells away each distance is reached (the minimum found is used below); only the largest distance is actually needed.
        value_1_row=[]
        value_2_row=[]
        value_3_row=[]
    
        value_1_col=[]
        value_2_col=[]
        value_3_col=[]
        
        count = 0
        
        within_distance_1_total=0
        within_distance_2_total=0
        within_distance_3_total=0
        
        #probably remove 
        summed_footprint_total = 0
        
        #for each of the dates for the current station, add to these lists:
        date_list=[]
        percent_1_list=[]
        percent_2_list=[]
        percent_3_list=[]
    
        within_distance_1_list=[]
        within_distance_2_list=[]
        within_distance_3_list=[]
    
        summed_footprint_list=[]
        
        #this is to locate the station column/row (=also lat/lon) in the excel file:
        for curr_row in range(0, num_rows, 1):
            
            #the station names are in the second (index 1) column:
            if worksheet.cell_value(curr_row, 1) == ist:
                
    
                #the "matched station" associated column and row in the footprint matrix is in column 3 (index 2) and column 4 (index 3)
                station_col = worksheet.cell_value(curr_row, 2)
                station_row = worksheet.cell_value(curr_row, 3)
    
                #point one (measured from) is the station cell. Distances are measured from the corner of the
                #station cell to the corner of each target cell; elsewhere there is code to shift the lat/lon
                #values by half a cell in each direction if cell centres are wanted instead.
                
                lon1= f_fp.variables['lon'][station_col]
                lat1= f_fp.variables['lat'][station_row]
                
                #how many cells away from the station cell. Which one to loop to. 
                for value in range(1,399,1):
                    
                    #station col+value must be within STILT domain. 
                    if (station_col+value)>399:
                        break
                        
                    #check how many cells away, by checking distance to the furthest away value:
                    
                    #lat2=lat1 --> same row (hence lat1+lat1 in the equation); only the column changes while
                    #checking the radius of the circle. The check above breaks out if column 399 is exceeded.
                    
                    
                    #check distance in longitude (column... only one that changes here).
                    lon2=f_fp.variables['lon'][station_col+value]
    
                    R = 6373.8
                    
                    #latitude remain same.... just moving to furthest away points
                    x = math.radians(lon1-lon2)*math.cos(math.radians(lat1+lat1)/2)
    
                    y=math.radians(lat1-lat1)
                    distance_col = math.sqrt((x*x)+(y*y)) * R
                            
                    if distance_col>int(distances_pbl_list[2]):
                        value_3_col.append(value)
                        break
                            
                #these values decide how "far away" (in cells) to loop to reach the edge of the circle.
                #the lowest value found still passes the distance, so subtract 1 to stay within it.
                loop_to_col_3= (min(value_3_col)-1)
                
        #the distance per row (latitude direction) is the same everywhere in the domain, so this only needs
        #to be done once per station (it still depends on the user-defined distances).
        for value in range(1,479,1):
            
            #make sure staying within the domain
            if (station_row+value)>479:
                
                break
    
            lat2=f_fp.variables['lat'][station_row+value]
    
            R = 6373.8
    
            x = math.radians(lon1-lon1)*math.cos(math.radians(lat1+lat2)/2)
    
            y=math.radians(lat1-lat2)
            distance_row = math.sqrt((x*x)+(y*y)) * R
    
            if distance_row>int(distances_pbl_list[2]):
    
                value_3_row.append(value)
                break
    
        #the easiest way to get the summed footprint values is the aggregated footprint (they are not stored
        #per individual footprint here). It is used for the map, and also to calculate the percentages.
        #the "fp" returned here also contains all the cell values that are added up within the different
        #circles to determine the sensitivity.
        nfp, fp, lon, lat, title2 = read_aggreg_footprints(ist, date_range, timeselect=timeselect)
        
        summed_footprint= fp[0].sum()
    
        loop_to_row_3= (min(value_3_row)-1)
    
          
        #Have a "square" defined in which the "circle" around the station is. 
        #loop over all the cells in the square and add to the correct "within distance" the user defined.
        within_distance_1= 0
        within_distance_2= 0
        within_distance_3= 0
        
        
        #loop over all the cells in the square. 
        #range, looping until the STOP of the range, but not including--> +1. 
        for row in range(int(station_row-loop_to_row_3), (int(station_row+loop_to_row_3)+1), 1):
    
            for col in range(int(station_col-loop_to_col_3), (int(station_col + loop_to_col_3)+1),1):  
    
                #add here --> check distance to the cell within the rectangle. 
                lon3=f_fp.variables['lon'][col]
                
                #now, moving in different latitude directions also, not just on a line like when determining the square. 
                lat3=f_fp.variables['lat'][row]
    
                x = math.radians(lon1-lon3)*math.cos(math.radians(lat1+lat3)/2)
    
                y=math.radians(lat1-lat3)
                distance_within_rectangle_1 = math.sqrt((x*x)+(y*y)) * R
    
                #if less than this! --> add
                if distance_within_rectangle_1<int(distances_pbl_list[0]):
    
    
                    within_distance_1= within_distance_1 + fp[0][row][col]
                    
                if distance_within_rectangle_1<int(distances_pbl_list[1]):
                    
                    within_distance_2= within_distance_2 + fp[0][row][col]    
                    
                if distance_within_rectangle_1<int(distances_pbl_list[2]):
                    within_distance_3= within_distance_3 + fp[0][row][col] 
                    
            
        #percentages of the total summed footprint; stored in the dictionary and shown in the map titles below.
        if within_distance_1 > 0 and summed_footprint>0:
            percent_1= (within_distance_1/summed_footprint)*100
        else:
            percent_1=0
        if within_distance_2 > 0 and summed_footprint>0:
            percent_2= (within_distance_2/summed_footprint)*100
        else:
            percent_2=0
        if within_distance_3 > 0 and summed_footprint>0:
            percent_3= (within_distance_3/summed_footprint)*100
        else:
            percent_3=0
            
            
        average_footprint_1_list.append(within_distance_1)
        average_footprint_2_list.append(within_distance_2)
        average_footprint_3_list.append(within_distance_3)
        
        #put in dictionary
        dictionary_sensitivity_area[ist]['percent'][distances_pbl_list[0]]=percent_1
        dictionary_sensitivity_area[ist]['percent'][distances_pbl_list[1]]=percent_2
        dictionary_sensitivity_area[ist]['percent'][distances_pbl_list[2]]=percent_3
        
        dictionary_sensitivity_area[ist]['absolute'][distances_pbl_list[0]]=within_distance_1
        dictionary_sensitivity_area[ist]['absolute'][distances_pbl_list[1]]=within_distance_2
        dictionary_sensitivity_area[ist]['absolute'][distances_pbl_list[2]]=within_distance_3
        dictionary_sensitivity_area[ist]['absolute']['total']=summed_footprint
        
    
        
        averaged_summed_footprint_list.append(summed_footprint)
    
        
        lat_station= stations[ist]['lat']
    
    
        lon_station= stations[ist]['lon']
        title = ('Average footprint ' + str(ist) +  '\n' +  str(start_date.year)+ '-'+ str(start_date.month) + '-' + str(start_date.day) + ' to ' + str(end_date.year) + '-' + str(end_date.month) + '-' + str(end_date.day)+ \
                 ', Hour(s): ' + str(timeselect) +  '\n' + 'Average sensitivity to whole domain: ' + str("%.2f" % summed_footprint) +  '\n' + 'Average within '+ str(distances_pbl_list[0]) + ' km: ' + str("%.2f" % within_distance_1) \
                 + ' (' + str("%.2f" % percent_1) + '%)' +  '\n' + 'Average within ' + str(distances_pbl_list[1]) + ' km: ' + str("%.2f" % within_distance_2) + ' (' + str("%.2f" % percent_2) + '%)' + '\n' + 'Average within ' +\
                 str(distances_pbl_list[2]) +  'km: ' + str("%.2f" % within_distance_3) + ' (' + str("%.2f" % percent_3) + '%)')
        
        #plot_maps_population(fp, lon, lat, lon_station, lat_station, title=title, label='surface influence', unit='[ppm / ($\mu$mol / m$^{2}$s)]', linlog='linear', vmin=-6, vmax=-2, station=[ist], zoom=ist)
        plot_maps_population(fp, lon, lat, lon_station, lat_station, title=title, label='surface influence', unit='[ppm / ($\mu$mol / m$^{2}$s)]', linlog='linear', station=[ist], zoom=ist)
    
    
    
    sorted_list_stations = [x for _,x in sorted(zip(averaged_summed_footprint_list,icos_stations))]
    
    sorted_averaged_summed_footprint_list = [x for _,x in sorted(zip(averaged_summed_footprint_list,averaged_summed_footprint_list))]
    
    sorted_average_footprint_1_list = [x for _,x in sorted(zip(averaged_summed_footprint_list,average_footprint_1_list))]
    
    sorted_average_footprint_2_list = [x for _,x in sorted(zip(averaged_summed_footprint_list,average_footprint_2_list))]
    
    sorted_average_footprint_3_list = [x for _,x in sorted(zip(averaged_summed_footprint_list,average_footprint_3_list))]
    
    
    matplotlib.rcParams.update({'font.size': 16})
    
    figure(num=None, figsize=(25, 12), dpi=80, facecolor='w', edgecolor='k')
    y_pos = np.arange(len(sorted_list_stations))
    ax = plt.subplot(111)
    
    
    p1 = ax.bar(y_pos-0.3, sorted_averaged_summed_footprint_list,width=0.2,color='gainsboro',align='center')
    p2 = ax.bar(y_pos-0.1, sorted_average_footprint_3_list,width=0.2,color='darkgrey',align='center')
    p3 = ax.bar(y_pos+0.1, sorted_average_footprint_2_list,width=0.2,color='dimgrey',align='center')
    p4 = ax.bar(y_pos+0.3, sorted_average_footprint_1_list,width=0.2,color='black',align='center')
    
    ax.legend((p1[0], p2[0], p3[0], p4[0]), ('average summed footprints', 'average within ' + str(distances_pbl_list[2]) + 'km', 'average within ' + str(distances_pbl_list[1]) + 'km', 'average within ' + str(distances_pbl_list[0]) + 'km'))
    plt.xticks(y_pos, sorted_list_stations)
    
    plt.ylabel('All footprint cells summed')
    
    date_index_number = (len(date_range) - 1)
    
    plt.title('STILT Model estimated sensitivity to the model domain' + '\n' + str(date_range[0].year) + '-' + str(date_range[0].month) + '-' + str(date_range[0].day)\
                + ' to ' + str(date_range[date_index_number].year) + '-' + str(date_range[date_index_number].month) + '-' + str(date_range[date_index_number].day)+\
              ' Hour(s): ' + timeselect)
    
    plt.show()
    
    
    #stacked bar, same as above in terms of data:
    
    #the stacked bar segments must not overlap, so each segment is the difference between consecutive totals:
    for_2_reduced=[x - y for x, y in zip(sorted_average_footprint_2_list, sorted_average_footprint_1_list)]
    
    for_3_reduced=[x - y for x, y in zip(sorted_average_footprint_3_list, sorted_average_footprint_2_list)]
    
    for_all_reduced=[x - y for x, y in zip(sorted_averaged_summed_footprint_list, sorted_average_footprint_3_list)]
    
    
    figure(num=None, figsize=(20, 10), dpi=80, facecolor='w', edgecolor='k')
    y_pos = np.arange(len(sorted_list_stations))
    ax = plt.subplot(111)
    
    p1= plt.bar(y_pos, for_all_reduced, bottom= sorted_average_footprint_3_list, width=0.25,color='gainsboro',align='center')
    p2= plt.bar(y_pos, for_3_reduced, bottom=sorted_average_footprint_2_list, width=0.25,color='darkgrey',align='center')
    p3= plt.bar(y_pos, for_2_reduced, bottom=sorted_average_footprint_1_list, width=0.25,color='dimgrey',align='center')
    p4= plt.bar(y_pos, sorted_average_footprint_1_list,width=0.25,color='black',align='center')
    
    
    ax.legend((p1[0], p2[0], p3[0], p4[0]), ('average summed footprints', 'average within ' + str(distances_pbl_list[2]) + 'km', 'average within ' + str(distances_pbl_list[1]) + 'km', 'average within ' + str(distances_pbl_list[0]) + 'km'))
    plt.xticks(y_pos, sorted_list_stations, rotation='vertical')
     
    plt.ylabel('All footprint cells summarized')
    
    plt.title('STILT Model estimated sensitivity to the model domain' + '\n' + str(date_range[0].year) + '-' + str(date_range[0].month) + '-' + str(date_range[0].day)\
                + ' to ' + str(date_range[date_index_number].year) + '-' + str(date_range[date_index_number].month) + '-' + str(date_range[date_index_number].day)+\
              ' Hour(s): ' + timeselect)
    
    
    plt.show()
    
    import pickle
    
    end = time.time()
    #print(end - start)
    
    
    if update_dictionaries=='yes':
        
        #dict = dictionary_point_source
        f = open("sensitivity_area_evaluation_custom.pkl","wb")
        pickle.dump(dictionary_sensitivity_area,f)
        f.close()
    
    Choose THREE distances from the station to check influence, from smallest to largest. Space between each entry: 100 200 300
    Do you want to update the dictionaries with the current selections? Write yes to update: 
    

    Back to top

    Percent sensitivity to edge of domain

    In this cell, the stations are ranked by their percent sensitivity to the outermost part of the domain, defined as the first and last rows and columns of the footprint grid (a short sketch of the edge sum follows below).
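
    A minimal sketch of the edge sum, assuming a footprint array fp of shape (1, 480, 400) as read from a STILT "foot" file; the values are made up for illustration:

    import numpy as np
    
    #assumed shape (time, lat, lon) = (1, 480, 400), as for the STILT footprints used below;
    #the values here are made-up random numbers, for illustration only
    fp = np.random.rand(1, 480, 400)
    
    total = fp[0].sum()
    
    #outermost rows (first and last latitude row) and columns (first and last longitude column);
    #note that the four corner cells end up in both a row sum and a column sum
    edge = fp[0, 0, :].sum() + fp[0, -1, :].sum() + fp[0, :, 0].sum() + fp[0, :, -1].sum()
    
    percent_edge = (edge / total) * 100 if total > 0 else 0
    print(percent_edge)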

    In [9]:
    list_valid_stations=[]
    list_percent_outer_by_station=[]
    my_iterator=iter(all_stations)
    
    for station in my_iterator:
        count=0
        tot_by_station=0
        tot_whole_by_station=0
        
        for date in date_range:
    
            filename=(pathFP+'slots/'+stations[station]['locIdent']+'/'+str(date.year)+'/'+str(date.month).zfill(2)+'/'
             +str(date.year)+'x'+str(date.month).zfill(2)+'x'+str(date.day).zfill(2)+'x'+str(date.hour).zfill(2)+'/foot')
            
            if os.path.isfile(filename):
            
                #access the footprint given the date/time.
                f_fp = cdf.Dataset(filename)
                
                fp=f_fp.variables['foot'][:,:,:]
                
                #sum of all cells - to calculate percentage
                tot_whole_by_date=fp.sum()
                
                bottom=fp[0][0][:].sum()
                top=fp[0][479][:].sum()
                
                #note: fp[0][:][0] would just give the first row again, so slice the column axis explicitly
                left=fp[0][:,0].sum()
                right=fp[0][:,399].sum()
                
                tot_by_date=bottom+top+left+right
    
                tot_whole_by_station=tot_whole_by_station+tot_whole_by_date
            
                tot_by_station=tot_by_station+tot_by_date
    
                #for average
                count=count+1
                
        if count>0:
            list_valid_stations.append(station)
    
            average_whole_by_station=tot_whole_by_station/count
            average_by_station=tot_by_station/count
            
            
            percent_outer_by_station=(average_by_station/average_whole_by_station)*100
            
            list_percent_outer_by_station.append(percent_outer_by_station)  
        
        
    sorted_list_valid_stations = [x for _,x in sorted(zip(list_percent_outer_by_station,list_valid_stations))]
    
    
    sorted_list_percent_by_station=[x for _,x in sorted(zip(list_percent_outer_by_station,list_percent_outer_by_station))]
    
    matplotlib.rcParams.update({'font.size': 19})
    
    x_pos = numpy.arange(len(sorted_list_valid_stations))
    
    fig, ax = plt.subplots(figsize=(25,10))
    
    p1 = ax.bar(x_pos, sorted_list_percent_by_station,color='tomato',align='center')
       
    ax.set_ylabel('Outer cells % of total sensitivity')
    ax.set_xticks(x_pos)
    ax.set_xticklabels(sorted_list_valid_stations,rotation='vertical')
    
    ax.set_title('Average % sensitivity to the outermost cell rows and columns')
    
    ax.yaxis.grid(True)
    
    plt.show()
    

    Back to top

    Sensitivity time series, averages by months and hours, and correlation to the size of the anthropogenic component

    Here, the sensitivity values of each individual footprint are considered. The user defines a distance within which the sensitivity is summed. The values are displayed as a time series, and are also sorted by hour and by month to show averages in bar graphs. The same is done for the footprints' sensitivities to the whole domain.
    One final aspect considered is the relationship between the total sensitivity and the anthropogenic component of each footprint. This is set up as a linear regression (a minimal sketch follows below). The resulting R2 values (the share of the variation in the anthropogenic component that can be explained by differences in total sensitivity) are often high, indicating that the two variables are positively correlated.
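
    A minimal sketch of such a regression, where two made-up lists stand in for the summed footprints and the anthropogenic component of the same dates:

    import numpy as np
    from scipy import stats
    
    #made-up example values for illustration only
    total_sensitivity = [120.0, 95.0, 140.0, 80.0, 110.0]
    anthro_component  = [3.1, 2.4, 3.8, 2.0, 2.9]
    
    slope, intercept, r_value, p_value, std_err = stats.linregress(total_sensitivity, anthro_component)
    
    #R2: share of the variation in the anthropogenic component explained by the total sensitivity
    print('R2 = %.2f' % r_value**2)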

    In [11]:
    distances_pbl = input("Choose the distance to check for time series (km): ")
    
    distances_pbl_list = distances_pbl.split()
    distances_pbl_list = [int(a) for a in distances_pbl_list] 
    
    
    #excel file with the cols/rows of stations.
    ExcelFileName= 'stilt_stations_col_row.xls'
    workbook = xlrd.open_workbook(ExcelFileName)
    worksheet = workbook.sheet_by_name('stilt_stations_col_row')
    num_rows = worksheet.nrows 
    
    num_cols = worksheet.ncols
    
    #for each of the stations, add to these lists:
    station_list=[]
    
    my_iterator=iter(all_stations)
    
    #just for the lat/lon values:
    f_fp= Dataset('class_522_forgotten.nc')
    
    #for each station... 
    for ist in my_iterator:
        
        average_footprint_1_list=[]
        
        averaged_summed_footprint_list=[]    
    
        list_anthro_value=[]
        list_dates_antho=[]
    
        #lists to collect how many cells away the distance is reached (the minimum found is used below).
        value_1_row=[]
    
    
        value_1_col=[]
    
    
        count = 0
    
        within_distance_1_total=0
    
    
        summed_footprint_total = 0
    
        #for each of the dates for the current station, add to these lists:
        date_list=[]
        percent_1_list=[]
    
        within_distance_1_list=[]
    
    
        summed_footprint_list=[]
    
        #by month and by hour:
        #within defined distance
        jan_count=0 
        jan_tot_list=[]
        feb_count=0
        feb_tot_list=[]
        mar_count=0
        mar_tot_list=[]
        apr_count=0
        apr_tot_list=[]
        may_count=0
        may_tot_list=[]
        jun_count=0
        jun_tot_list=[]
        jul_count=0
        jul_tot_list=[]
        aug_count=0
        aug_tot_list=[]
        sep_count=0
        sep_tot_list=[]
        oct_count=0
        oct_tot_list=[]
        nov_count=0
        nov_tot_list=[]
        dec_count=0
        dec_tot_list=[]
        
        #whole stilt domain 
        jan_count_whole=0 
        jan_tot_list_whole=[]
        feb_count_whole=0
        feb_tot_list_whole=[]
        mar_count_whole=0
        mar_tot_list_whole=[]
        apr_count_whole=0
        apr_tot_list_whole=[]
        may_count_whole=0
        may_tot_list_whole=[]
        jun_count_whole=0
        jun_tot_list_whole=[]
        jul_count_whole=0
        jul_tot_list_whole=[]
        aug_count_whole=0
        aug_tot_list_whole=[]
        sep_count_whole=0
        sep_tot_list_whole=[]
        oct_count_whole=0
        oct_tot_list_whole=[]
        nov_count_whole=0
        nov_tot_list_whole=[]
        dec_count_whole=0
        dec_tot_list_whole=[]
        
        #by hour
        count_0=0 
        tot_list_0=[]
        count_3=0
        tot_list_3=[]
        count_6=0
        tot_list_6=[]
        count_9=0
        tot_list_9=[]
        count_12=0
        tot_list_12=[]
        count_15=0
        tot_list_15=[]
        count_18=0
        tot_list_18=[]
        count_21=0
        tot_list_21=[]
    
        #whole stilt domain 
        count_whole_0=0 
        tot_list_whole_0=[]
        count_whole_3=0
        tot_list_whole_3=[]
        count_whole_6=0
        tot_list_whole_6=[]
        count_whole_9=0
        tot_list_whole_9=[]
        count_whole_12=0
        tot_list_whole_12=[]
        count_whole_15=0
        tot_list_whole_15=[]
        count_whole_18=0
        tot_list_whole_18=[]
        count_whole_21=0
        tot_list_whole_21=[]
        
        
        #count nan --> number of nan-values by month (will be in labels for the different months) 
        count_nan_jan=0
        count_nan_feb=0
        count_nan_mar=0
        count_nan_apr=0
        count_nan_may=0
        count_nan_jun=0
        count_nan_jul=0
        count_nan_aug=0
        count_nan_sep=0
        count_nan_oct=0
        count_nan_nov=0
        count_nan_dec=0
        
        count_nan_0=0
        count_nan_3=0
        count_nan_6=0
        count_nan_9=0
        count_nan_12=0
        count_nan_15=0
        count_nan_18=0
        count_nan_21=0
    
    
        #this is to locate the station column/row in the excel file:
        for curr_row in range(0, num_rows, 1):
    
            #the station names are in the second (index 1) column:
            if worksheet.cell_value(curr_row, 1) == ist:
    
                #the "matched station" associated column and row in the footprint matrix is in column 3 (index 2) and column 4 (index 3)
                station_col = worksheet.cell_value(curr_row, 2)
                station_row = worksheet.cell_value(curr_row, 3)
    
                #point one (measured from) is the station cell. Distances are measured from the corner of the
                #station cell to the corner of each target cell; elsewhere there is code to shift the lat/lon
                #values by half a cell in each direction if cell centres are wanted instead.
                lon1= f_fp.variables['lon'][station_col]
                lat1= f_fp.variables['lat'][station_row]
    
                #if distances_pbl_list[0]==1000:
                 #   print ('Calculating the total')
                    
                #else:
                
                #what column value is the furthest away within the distance?
                for value in range(1,399,1):
    
                    #station col+value must be within STILT domain. 
                    if (station_col+value)>399:
                        break
    
                    #check how many cells away:
                    #lat2=lat1 --> same row when checking for the "maximum" away - r of the circle
                    #add if over 399 or whatever... break. have to send an message too
    
    
                    #check distance in longitude (the column is the only index that changes here); the latitude
                    #part could be moved outside the loop since it is the same for every column.
                    lon2=f_fp.variables['lon'][station_col+value]
    
                    R = 6373.8
    
                    x = math.radians(lon1-lon2)*math.cos(math.radians(lat1+lat1)/2)
    
                    y=math.radians(lat1-lat1)
                    distance_col = math.sqrt((x*x)+(y*y)) * R
    
                    #only interested in the furthest away?
                    if distance_col>int(distances_pbl_list[0]):
    
                        #then do the min in this list!
                        value_1_col.append(value)
                        break
    
    
                #these are the values that decide how "far away" to loop.
                loop_to_col_1= min(value_1_col)
    
    
        #always the same distance in row --> only need to do it once
        for value in range(1,479,1):
            #check how many cells away:
            if (station_row+value)>479:
                break
    
            lat2=f_fp.variables['lat'][station_row+value]
    
            R = 6373.8
    
            x = math.radians(lon1-lon1)*math.cos(math.radians(lat1+lat2)/2)
    
            y=math.radians(lat1-lat2)
            distance_row = math.sqrt((x*x)+(y*y)) * R
    
            if distance_row>int(distances_pbl_list[0]):
    
                #then do the min in this list!
                value_1_row.append(value)
                break
    
        loop_to_row_1= min(value_1_row)
    
    
        #now that we know how far to loop away, we do it for each data in the date_range (for each station)
        #####################################################
        
        #for each footprint in that date range:
        for date in date_range:
            
            filename=(pathFP+'slots/'+stations[ist]['locIdent']+'/'+str(date.year)+'/'+str(date.month).zfill(2)+'/'
             +str(date.year)+'x'+str(date.month).zfill(2)+'x'+str(date.day).zfill(2)+'x'+str(date.hour).zfill(2)+'/foot')
            
      
            if os.path.isfile(filename):
            
                #access the footprint given the date/time.
                f_fp = cdf.Dataset(filename)
                
                date_list.append(date)
                
                #this is new --> check the sensitivity for each seperate footprint 
                fp=f_fp.variables['foot'][:,:,:]
    
    
                summed_footprint= fp[0].sum()
                
                #by month
                if date.month==1:
                    jan_count_whole=jan_count_whole+1
                    jan_tot_list_whole.append(summed_footprint)
                if date.month==2:
                    feb_count_whole=feb_count_whole+1
                    feb_tot_list_whole.append(summed_footprint)
                if date.month==3:
                    mar_count_whole=mar_count_whole+1
                    mar_tot_list_whole.append(summed_footprint)
                if date.month==4:
                    apr_count_whole=apr_count_whole+1
                    apr_tot_list_whole.append(summed_footprint)
                if date.month==5:
                    may_count_whole=may_count_whole+1
                    may_tot_list_whole.append(summed_footprint)
                if date.month==6:
                    jun_count_whole=jun_count_whole+1
                    jun_tot_list_whole.append(summed_footprint)
                if date.month==7:
                    jul_count_whole=jul_count_whole+1
                    jul_tot_list_whole.append(summed_footprint)
                if date.month==8:
                    aug_count_whole=aug_count_whole+1
                    aug_tot_list_whole.append(summed_footprint)
                if date.month==9:
                    sep_count_whole=sep_count_whole+1
                    sep_tot_list_whole.append(summed_footprint)
                if date.month==10:
                    oct_count_whole=oct_count_whole+1
                    oct_tot_list_whole.append(summed_footprint)
                if date.month==11:
                    nov_count_whole=nov_count_whole+1
                    nov_tot_list_whole.append(summed_footprint)
                if date.month==12:
                    dec_count_whole=dec_count_whole+1   
                    dec_tot_list_whole.append(summed_footprint)
                
                #by hour
                if date.hour==0:
                    count_whole_0=count_whole_0+1
                    tot_list_whole_0.append(summed_footprint)
                if date.hour==3:
                    count_whole_3=count_whole_3+1
                    tot_list_whole_3.append(summed_footprint)
                if date.hour==6:
                    count_whole_6=count_whole_6+1
                    tot_list_whole_6.append(summed_footprint)
                if date.hour==9:
                    count_whole_9=count_whole_9+1
                    tot_list_whole_9.append(summed_footprint)
                if date.hour==12:
                    count_whole_12=count_whole_12+1
                    tot_list_whole_12.append(summed_footprint)
                if date.hour==15:
                    count_whole_15=count_whole_15+1
                    tot_list_whole_15.append(summed_footprint)
                if date.hour==18:
                    count_whole_18=count_whole_18+1
                    tot_list_whole_18.append(summed_footprint)
                if date.hour==21:
                    count_whole_21=count_whole_21+1
                    tot_list_whole_21.append(summed_footprint)
               
    
                within_distance_1= 0
    
    
                #the range could be limited to the square around the station (loop_to_row_1/loop_to_col_1)
                #instead of looping over the whole grid.
                #happens once for every date:
                for row in range(fp.shape[1]):
    
                    for col in range(fp.shape[2]):
    
                        if row>(station_row-loop_to_row_1) and row<(station_row+loop_to_row_1+1) and col>(station_col-loop_to_col_1) and col<(station_col+loop_to_col_1+1):
                            
                            within_distance_1= within_distance_1 + fp[0][row][col]
                if date.month==1:
                    jan_count=jan_count+1
                    jan_tot_list.append(within_distance_1)
                if date.month==2:
                    feb_count=feb_count+1
                    feb_tot_list.append(within_distance_1)
                if date.month==3:
                    mar_count=mar_count+1
                    mar_tot_list.append(within_distance_1)
                if date.month==4:
                    apr_count=apr_count+1
                    apr_tot_list.append(within_distance_1)
                if date.month==5:
                    may_count=may_count+1
                    may_tot_list.append(within_distance_1)
                if date.month==6:
                    jun_count=jun_count+1
                    jun_tot_list.append(within_distance_1)
                if date.month==7:
                    jul_count=jul_count+1
                    jul_tot_list.append(within_distance_1)
                if date.month==8:
                    aug_count=aug_count+1
                    aug_tot_list.append(within_distance_1)
                if date.month==9:
                    sep_count=sep_count+1
                    sep_tot_list.append(within_distance_1)
                if date.month==10:
                    oct_count=oct_count+1
                    oct_tot_list.append(within_distance_1)
                if date.month==11:
                    nov_count=nov_count+1
                    nov_tot_list.append(within_distance_1)
                if date.month==12:
                    dec_count=dec_count+1   
                    dec_tot_list.append(within_distance_1)
                    
                #by hour    
                if date.hour==0:
                    count_0=count_0+1
                    tot_list_0.append(within_distance_1)
                if date.hour==3:
                    count_3=count_3+1
                    tot_list_3.append(within_distance_1)
                if date.hour==6:
                    count_6=count_6+1
                    tot_list_6.append(within_distance_1)
                if date.hour==9:
                    count_9=count_9+1
                    tot_list_9.append(within_distance_1)
                if date.hour==12:
                    count_12=count_12+1
                    tot_list_12.append(within_distance_1)
                if date.hour==15:
                    count_15=count_15+1
                    tot_list_15.append(within_distance_1)
                if date.hour==18:
                    count_18=count_18+1
                    tot_list_18.append(within_distance_1)
                if date.hour==21:
                    count_21=count_21+1
                    tot_list_21.append(within_distance_1)
    
    
    
                average_footprint_1_list.append(within_distance_1)
    
                
                averaged_summed_footprint_list.append(summed_footprint)
                
                #also append the 
            
            #add to count nan if date is nan:
            else:
                if date.month==1:
                    count_nan_jan=count_nan_jan+1
                if date.month==2:
                    count_nan_feb=count_nan_feb+1
                if date.month==3:
                    count_nan_mar=count_nan_mar+1
                if date.month==4:
                    count_nan_apr=count_nan_apr+1
                if date.month==5:
                    count_nan_may=count_nan_may+1
                if date.month==6:
                    count_nan_jun=count_nan_jun+1
                if date.month==7:
                    count_nan_jul=count_nan_jul+1
                if date.month==8:
                    count_nan_aug=count_nan_aug+1
                if date.month==9:
                    count_nan_sep=count_nan_sep+1
                if date.month==10:
                    count_nan_oct=count_nan_oct+1
                if date.month==11:
                    count_nan_nov=count_nan_nov+1
                if date.month==12:
                    count_nan_dec=count_nan_dec+1
                
                #nan by hour
                if date.hour==0:
                    count_nan_0=count_nan_0+1
                if date.hour==3:
                    count_nan_3=count_nan_3+1
                if date.hour==6:
                    count_nan_6=count_nan_6+1
                if date.hour==9:
                    count_nan_9=count_nan_9+1
                if date.hour==12:
                    count_nan_12=count_nan_12+1
                if date.hour==15:
                    count_nan_15=count_nan_15+1
                if date.hour==18:
                    count_nan_18=count_nan_18+1
                if date.hour==21:
                    count_nan_21=count_nan_21+1
        #also get standard deviation values! 
        
        standard_deviation_within_distance = numpy.nanstd(average_footprint_1_list)
        
        standard_deviation_whole_summed = numpy.nanstd(averaged_summed_footprint_list)
        
        
        average_within_distance= numpy.nanmean(average_footprint_1_list)
        
        average_deviation_whole_summed= numpy.nanmean(averaged_summed_footprint_list)
        
        
        #time series: summed footprints within defined distance. 
        matplotlib.rcParams.update({'font.size': 17})
        fig = p.figure(figsize=(25,10))
        ax = fig.add_subplot(111)
    
    
        #time series of the summed sensitivity within the user-defined distance:
        p.plot(date_list, average_footprint_1_list, linestyle='-',color='g')
    
    
        p.axhline(0, color='black', lw=1)
    
        date_index_number = (len(date_range) - 1)
    
        p.title('Sensitivity within ' + str(distances_pbl_list[0]) + ' km, by footprint, for station ' + str(ist) + '\n' + str(date_range[0].year) + '-' + str(date_range[0].month) + '-' + str(date_range[0].day)\
               + ' to ' + str(date_range[date_index_number].year) + '-' + str(date_range[date_index_number].month) + '-' + str(date_range[date_index_number].day)\
             + ', Hour(s): ' + timeselect + '\n' + 'Average: ' + str("%.2f" % average_within_distance) + '\n' + 'Standard deviation: ' + str("%.2f" % standard_deviation_within_distance))
    
        #ax.set_xlim(start_date,end_date)
        ax.set_ylabel('All footprints summarized ')
        ax.grid(axis='x')
    
        p.show()
        
        #prepare data for the bar graph showing monthly averages within the defined distance
        by_month_averages=[]
        by_month_names_for_label=[]
        if jan_count>0:
           
            by_month_averages.append(numpy.mean(jan_tot_list))
            
            by_month_names_for_label.append('Jan' + ' (' + str(count_nan_jan) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(jan_tot_list)))
    
        if feb_count>0:
            by_month_averages.append(numpy.mean(feb_tot_list))
            
            by_month_names_for_label.append('Feb' + ' (' + str(count_nan_feb) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(feb_tot_list)))
    
        if mar_count>0:
            by_month_averages.append(numpy.mean(mar_tot_list))
            
            by_month_names_for_label.append('Mar' + ' (' + str(count_nan_mar) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(mar_tot_list)))
        if apr_count>0:
            by_month_averages.append(numpy.mean(apr_tot_list))
            
            by_month_names_for_label.append('Apr' + ' (' + str(count_nan_apr) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(apr_tot_list)))
        if may_count>0:
            by_month_averages.append(numpy.mean(may_tot_list))
            
            by_month_names_for_label.append('May' + ' (' + str(count_nan_may) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(may_tot_list)))
        if jun_count>0:
            by_month_averages.append(numpy.mean(jun_tot_list))
            
            by_month_names_for_label.append('Jun' + ' (' + str(count_nan_jun) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(jun_tot_list)))
        if jul_count>0:
            by_month_averages.append(numpy.mean(jul_tot_list))
            
            by_month_names_for_label.append('Jul' + ' (' + str(count_nan_jul) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(jul_tot_list)))
        if aug_count>0:
            by_month_averages.append(numpy.mean(aug_tot_list))
            
            by_month_names_for_label.append('Aug' + ' (' + str(count_nan_aug) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(aug_tot_list)))
        if sep_count>0:
            by_month_averages.append(numpy.mean(sep_tot_list))
            
            by_month_names_for_label.append('Sep' + ' (' + str(count_nan_sep) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(sep_tot_list)))
        if oct_count>0:
            by_month_averages.append(numpy.mean(oct_tot_list))
            
            by_month_names_for_label.append('Oct' + ' (' + str(count_nan_oct) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(oct_tot_list)))
        if nov_count>0:
            by_month_averages.append(numpy.mean(nov_tot_list))
            
            by_month_names_for_label.append('Nov' + ' (' + str(count_nan_nov) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(nov_tot_list)))
        if dec_count>0:
            by_month_averages.append(numpy.mean(dec_tot_list))
            
            by_month_names_for_label.append('Dec' + ' (' + str(count_nan_dec) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(dec_tot_list)))
        
    
        #display by month/by hour for within distance:
        N = len(by_month_averages)
    
        ind = np.arange(N)    
        width = 0.35    
    
        #added figure stuff for size change
        from matplotlib.pyplot import figure
    
        figure(num=None, figsize=(11, 7), dpi=80, facecolor='w', edgecolor='k')
    
        ax = plt.subplot(111)
        ax.yaxis.grid(True)
    
        plt.bar(ind, by_month_averages, width, color='blue')
        
        matplotlib.rcParams.update({'font.size': 12})
        plt.ylabel('All footprints summed')
    
        plt.xlabel('Month (count NaN)')
        #added figsize
    
    
        #for the title, variables from the previous cell could also be used (e.g. number of missing measurements);
        #the NaN count here can simply mean there is no footprint for that date.
        plt.title(ist +' summed footprint cells within ' + str(distances_pbl_list[0]) + ' km' + '\n' + str(date_range[0].year) + '-' + str(date_range[0].month) + '-' + str(date_range[0].day)\
                 + ' to ' + str(date_range[date_index_number].year) + '-' + str(date_range[date_index_number].month) + '-' + str(date_range[date_index_number].day) + ', Hour(s): ' + timeselect)
        plt.xticks(ind, by_month_names_for_label, rotation='vertical')
    
        plt.axhline(0, color='black', lw=2)
    
        plt.show()
        
        #prepare to show the same, within defined distance but by hour:
        
        by_hour_averages=[]
        by_hour_names_for_label=[]
    
        if count_0>0:
            by_hour_averages.append(numpy.mean(tot_list_0))
            by_hour_names_for_label.append('0' + ' (' + str(count_nan_0) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_0)))
    
        if count_3>0:
            by_hour_averages.append(numpy.mean(tot_list_3))
            by_hour_names_for_label.append('3' + ' (' + str(count_nan_3) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_3)))
        
        if count_6>0:
            by_hour_averages.append(numpy.mean(tot_list_6))
            by_hour_names_for_label.append('6' + ' (' + str(count_nan_6) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_6)))
            
        if count_9>0:
            by_hour_averages.append(numpy.mean(tot_list_9))
            by_hour_names_for_label.append('9' + ' (' + str(count_nan_9) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_9)))
            
        if count_12>0:
            by_hour_averages.append(numpy.mean(tot_list_12))
            by_hour_names_for_label.append('12' + ' (' + str(count_nan_12) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_12)))
            
        if count_15>0:
            by_hour_averages.append(numpy.mean(tot_list_15))
            by_hour_names_for_label.append('15' + ' (' + str(count_nan_15) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_15)))
            
        if count_18>0:
            by_hour_averages.append(numpy.mean(tot_list_18))
            by_hour_names_for_label.append('18' + ' (' + str(count_nan_18) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_18)))
            
        if count_21>0:
            by_hour_averages.append(numpy.mean(tot_list_21))
            by_hour_names_for_label.append('21' + ' (' + str(count_nan_21) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_21)))
    
    
        N = len(by_hour_averages)
    
        ind = np.arange(N)    
        width = 0.35    
    
        #added figure stuff for size change
        from matplotlib.pyplot import figure
    
        figure(num=None, figsize=(11, 7), dpi=80, facecolor='w', edgecolor='k')
    
        ax = plt.subplot(111)
        ax.yaxis.grid(True)
    
        plt.bar(ind, by_hour_averages, width, color='green')
        matplotlib.rcParams.update({'font.size': 12})
        plt.ylabel('All footprints summed')
    
        plt.xlabel('Hour (count NaN)')
        #added figsize
    
    
        #for the title, variables from the previous cell could also be used (e.g. number of missing measurements);
        #the NaN count here can simply mean there is no footprint for that date.
        plt.title(ist +' summed footprint cells within ' + str(distances_pbl_list[0]) + ' km' + '\n' + str(date_range[0].year) + '-' + str(date_range[0].month) + '-' + str(date_range[0].day)\
                 + ' to ' + str(date_range[date_index_number].year) + '-' + str(date_range[date_index_number].month) + '-' + str(date_range[date_index_number].day) + ', Hour(s): ' + timeselect)
        plt.xticks(ind, by_hour_names_for_label, rotation='vertical')
    
        plt.axhline(0, color='black', lw=2)
    
        plt.show()
        
        
        #showing the summarized cells:
        matplotlib.rcParams.update({'font.size': 17})
        fig = p.figure(figsize=(25,10))
        ax = fig.add_subplot(111)
    
    
        p.plot(date_list, averaged_summed_footprint_list, linestyle='-',color='g')
    
        p.axhline(0, color='black', lw=1)
    
        date_index_number = (len(date_range) - 1)
    
        p.title('Sensitivity to the whole domain, by footprint, for station ' + str(ist) + '\n' + str(date_range[0].year) + '-' + str(date_range[0].month) + '-' + str(date_range[0].day)\
               + ' to ' + str(date_range[date_index_number].year) + '-' + str(date_range[date_index_number].month) + '-' + str(date_range[date_index_number].day)\
             + ', Hour(s): ' + timeselect + '\n' + 'Average: ' + str("%.2f" % average_deviation_whole_summed) + '\n' + 'Standard deviation: ' + str("%.2f" % standard_deviation_whole_summed))
    
        #ax.set_xlim(start_date,end_date)
        ax.set_ylabel('All footprints summarized ')
        
        ax.grid(axis='x')
        #ax.legend(loc='upper right')
        #ax.tick_params(axis='y', labelcolor='r')
        p.show()
    
        
        #by month, for whole:
        
        by_month_averages_whole=[]
        by_month_names_for_label_whole=[]
        
        if jan_count_whole>0:
           
            by_month_averages_whole.append(numpy.mean(jan_tot_list_whole))
            
            by_month_names_for_label_whole.append('Jan' + ' (' + str(count_nan_jan) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(jan_tot_list_whole)))
    
        if feb_count_whole>0:
            by_month_averages_whole.append(numpy.mean(feb_tot_list_whole))
            
            by_month_names_for_label_whole.append('Feb' + ' (' + str(count_nan_feb) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(feb_tot_list_whole)))
    
        if mar_count_whole>0:
            by_month_averages_whole.append(numpy.mean(mar_tot_list_whole))
            
            by_month_names_for_label_whole.append('Mar' + ' (' + str(count_nan_mar) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(mar_tot_list_whole)))
        if apr_count_whole>0:
            by_month_averages_whole.append(numpy.mean(apr_tot_list_whole))
            
            by_month_names_for_label_whole.append('Apr' + ' (' + str(count_nan_apr) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(apr_tot_list_whole)))
        if may_count_whole>0:
            by_month_averages_whole.append(numpy.mean(may_tot_list_whole))
            
            by_month_names_for_label_whole.append('May' + ' (' + str(count_nan_may) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(may_tot_list_whole)))
        if jun_count_whole>0:
            by_month_averages_whole.append(numpy.mean(jun_tot_list_whole))
            
            by_month_names_for_label_whole.append('Jun' + ' (' + str(count_nan_jun) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(jun_tot_list_whole)))
        if jul_count_whole>0:
            by_month_averages_whole.append(numpy.mean(jul_tot_list_whole))
            
            by_month_names_for_label_whole.append('Jul' + ' (' + str(count_nan_jul) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(jul_tot_list_whole)))
        if aug_count_whole>0:
            by_month_averages_whole.append(numpy.mean(aug_tot_list_whole))
            
            by_month_names_for_label_whole.append('Aug' + ' (' + str(count_nan_aug) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(aug_tot_list_whole)))
        if sep_count_whole>0:
            by_month_averages_whole.append(numpy.mean(sep_tot_list_whole))
            
            by_month_names_for_label_whole.append('Sep' + ' (' + str(count_nan_sep) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(sep_tot_list_whole)))
        if oct_count_whole>0:
            by_month_averages_whole.append(numpy.mean(oct_tot_list_whole))
            
            by_month_names_for_label_whole.append('Oct' + ' (' + str(count_nan_oct) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(oct_tot_list_whole)))
        if nov_count_whole>0:
            by_month_averages_whole.append(numpy.mean(nov_tot_list_whole))
            
            by_month_names_for_label_whole.append('Nov' + ' (' + str(count_nan_nov) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(nov_tot_list_whole)))
        if dec_count_whole>0:
            by_month_averages_whole.append(numpy.mean(dec_tot_list_whole))
            
            by_month_names_for_label_whole.append('Dec' + ' (' + str(count_nan_dec) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(dec_tot_list_whole)))
            
        print('whole, by month: ', by_month_averages_whole)
        N = len(by_month_averages_whole)
    
        ind = np.arange(N)    
        width = 0.35    
    
        #added figure stuff for size change
        from matplotlib.pyplot import figure
    
        figure(num=None, figsize=(11, 7), dpi=80, facecolor='w', edgecolor='k')
    
        ax = plt.subplot(111)
        ax.yaxis.grid(True)
    
        plt.bar(ind, by_month_averages_whole, width, color='blue')
        matplotlib.rcParams.update({'font.size': 12})
        plt.ylabel('All footprints summed')
    
        plt.xlabel('Month (count NaN)')
        # title: could also include variables from the previous cell (e.g. number of missing measurements);
        # a NaN count here can also just mean no valid cells for that time step
        plt.title(ist + ' summed footprint cells within whole STILT domain' + '\n' + str(date_range[0].year) + '-' + str(date_range[0].month) + '-' + str(date_range[0].day)\
                 + ' to ' + str(date_range[date_index_number].year) + '-' + str(date_range[date_index_number].month) + '-' + str(date_range[date_index_number].day) + ', Hour(s): ' + timeselect)
        plt.xticks(ind, by_month_names_for_label_whole, rotation='vertical')
    
        plt.axhline(0, color='black', lw=2)
    
        plt.show()
        
        
        #by hour, whole:
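        # one bar per 3-hour slot (0, 3, ..., 21 UTC), matching the hours for which STILT footprints are generated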
    
        by_hour_averages_whole=[]
        by_hour_names_for_label_whole=[]
        
        if count_whole_0>0:
            by_hour_averages_whole.append(numpy.mean(tot_list_whole_0)) 
            by_hour_names_for_label_whole.append('0' + ' (' + str(count_nan_0) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_whole_0)))
    
        if count_whole_3>0:
            by_hour_averages_whole.append(numpy.mean(tot_list_whole_3)) 
            by_hour_names_for_label_whole.append('3' + ' (' + str(count_nan_3) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_whole_3)))
            
        if count_whole_6>0:
            by_hour_averages_whole.append(numpy.mean(tot_list_whole_6)) 
            by_hour_names_for_label_whole.append('6' + ' (' + str(count_nan_6) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_whole_6)))
            
        if count_whole_9>0:
            by_hour_averages_whole.append(numpy.mean(tot_list_whole_9)) 
            by_hour_names_for_label_whole.append('9' + ' (' + str(count_nan_9) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_whole_9)))
            
        if count_whole_12>0:
            by_hour_averages_whole.append(numpy.mean(tot_list_whole_12)) 
            by_hour_names_for_label_whole.append('12' + ' (' + str(count_nan_12) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_whole_12)))
            
        if count_whole_15>0:
            by_hour_averages_whole.append(numpy.mean(tot_list_whole_15)) 
            by_hour_names_for_label_whole.append('15' + ' (' + str(count_nan_15) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_whole_15)))
            
        if count_whole_18>0:
            by_hour_averages_whole.append(numpy.mean(tot_list_whole_18)) 
            by_hour_names_for_label_whole.append('18' + ' (' + str(count_nan_18) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_whole_18)))
            
        if count_whole_21>0:
            by_hour_averages_whole.append(numpy.mean(tot_list_whole_21)) 
            by_hour_names_for_label_whole.append('21' + ' (' + str(count_nan_21) + ')' + '\n' + 'Std: ' + str("%.2f" % numpy.std(tot_list_whole_21)))
            
    
        N = len(by_hour_averages_whole)
    
        ind = np.arange(N)    
        width = 0.35    
    
        #set figure size (figure is already imported from matplotlib.pyplot at the top of the notebook)
        figure(num=None, figsize=(11, 7), dpi=80, facecolor='w', edgecolor='k')
    
        ax = plt.subplot(111)
        ax.yaxis.grid(True)
    
        plt.bar(ind, by_hour_averages_whole, width, color='green')
        matplotlib.rcParams.update({'font.size': 12})
        plt.ylabel('All footprints summed')
    
        plt.xlabel('Hour (count NaN)')
        # title: could also include variables from the previous cell (e.g. number of missing measurements);
        # a NaN count here can also just mean no valid cells for that time step
        plt.title(ist + ' summed footprint cells within whole STILT domain' + '\n' + str(date_range[0].year) + '-' + str(date_range[0].month) + '-' + str(date_range[0].day)\
                 + ' to ' + str(date_range[date_index_number].year) + '-' + str(date_range[date_index_number].month) + '-' + str(date_range[date_index_number].day)+ ', Hour(s): ' + timeselect)
        plt.xticks(ind, by_hour_names_for_label_whole, rotation='vertical')
    
        plt.axhline(0, color='black', lw=2)
    
        plt.show()
        
        #compare to anthro data!
        df = read_stilt_timeseries(ist, date_range)
        if df.empty:
            print('no STILT data for defined date range for ', ist)
        else:
    
            date_index_number = (len(date_range) - 1)
    
            count = 0   # row index into df, reset before looping over this station's time series
            for value in df['co2.fuel']:
                if df['co2.fuel'].index[count] in date_range:
                    date= df['co2.fuel'].index[count]
                    
                    if math.isnan(value)==False:
                        back_val=df['co2.background'][count]
                        stilt_val=df['co2.stilt'][count]
                        bio_val=df['co2.bio'][count]
    
                        #to get anthropogenic total:
                        industry_val_orig =df['co2.industry'][count]
                        energy_val_orig=df['co2.energy'][count]
                        transport_val_orig=df['co2.transport'][count]
                        others_val_orig=df['co2.others'][count]
                        anthropogenic_val_orig = industry_val_orig + energy_val_orig + transport_val_orig + others_val_orig
                        
                        list_anthro_value.append(anthropogenic_val_orig)
                        list_dates_antho.append(date)
                        
                    count=count+1
        
    
        # keep only the dates where both the anthropogenic component and the summed footprint sensitivity are valid
        anthro_valid_both=[]
        sensitivity_valid_both=[]
        index_for_date=0
        for value in list_anthro_value:
            if math.isnan(value)==False:
                if list_dates_antho[index_for_date] in date_list:
                    index_correct=date_list.index(list_dates_antho[index_for_date])
                
                    if math.isnan(averaged_summed_footprint_list[index_correct])==False:
                        anthro_valid_both.append(value)
                        sensitivity_valid_both.append(averaged_summed_footprint_list[index_correct])
                        
            index_for_date=index_for_date+1
                        
        import statsmodels.api as sm
    
        x=anthro_valid_both
        y=sensitivity_valid_both
    
        # regression through the origin: no intercept is added (sm.add_constant is not used),
        # so the reported R-squared is computed against a zero baseline
        results = sm.OLS(y, x).fit()
        
        print(ist, 'Footprint sensitivity as dependent on the anthropogenic component:')
        print(results.summary())
        
        
        print('Min sensitivity: ', min(averaged_summed_footprint_list))
        
        print('Max sensitivity: ', max(averaged_summed_footprint_list))
        
        print('Min anthropogenic: ', min(list_anthro_value))
        
        print('Max anthropogenic: ', max(list_anthro_value))
            
        
    
    Choose the distance to check for time series (km): 100
    
    whole, by month:  [5.530341292745053, 5.66783520404838, 4.56501314891794, 3.93604444675296, 5.247959349437558, 3.9381345425711376, 4.240142108293991, 4.603674176897977, 4.803702428706413, 4.629100193021988, 4.8183922914024455, 4.954612413560292]
    
    HTM150 Footprint sensitivity as dependent on the anthropogenic component:
                                OLS Regression Results                            
    ==============================================================================
    Dep. Variable:                      y   R-squared:                       0.643
    Model:                            OLS   Adj. R-squared:                  0.643
    Method:                 Least Squares   F-statistic:                     5255.
    Date:                Wed, 18 Mar 2020   Prob (F-statistic):               0.00
    Time:                        14:31:00   Log-Likelihood:                -7539.8
    No. Observations:                2920   AIC:                         1.508e+04
    Df Residuals:                    2919   BIC:                         1.509e+04
    Df Model:                           1                                         
    Covariance Type:            nonrobust                                         
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    x1             1.4503      0.020     72.490      0.000       1.411       1.490
    ==============================================================================
    Omnibus:                      436.534   Durbin-Watson:                   0.447
    Prob(Omnibus):                  0.000   Jarque-Bera (JB):             2314.569
    Skew:                          -0.602   Prob(JB):                         0.00
    Kurtosis:                       7.192   Cond. No.                         1.00
    ==============================================================================
    
    Warnings:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
    Min sensitivity:  0.6715620010476686
    Max sensitivity:  28.402065732215796
    Min anthropogenic:  0.041575983111292986
    Max anthropogenic:  17.153441044844072
    
    whole, by month:  [1.7200858890152944, 1.9982413610232237, 2.0874068829586263, 2.463350010042122, 2.7355019637509987, 3.1162330361475985, 2.908493070986922, 2.6487888877051478, 2.8710530041261872, 1.8180491971240758, 1.8486001687299731, 1.6934173086874842]
    
    PRS Footprint sensitivity as dependent on the anthropogenic component:
                                OLS Regression Results                            
    ==============================================================================
    Dep. Variable:                      y   R-squared:                       0.764
    Model:                            OLS   Adj. R-squared:                  0.764
    Method:                 Least Squares   F-statistic:                     9318.
    Date:                Wed, 18 Mar 2020   Prob (F-statistic):               0.00
    Time:                        14:36:07   Log-Likelihood:                -5230.4
    No. Observations:                2872   AIC:                         1.046e+04
    Df Residuals:                    2871   BIC:                         1.047e+04
    Df Model:                           1                                         
    Covariance Type:            nonrobust                                         
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    x1             1.8121      0.019     96.530      0.000       1.775       1.849
    ==============================================================================
    Omnibus:                      559.150   Durbin-Watson:                   0.656
    Prob(Omnibus):                  0.000   Jarque-Bera (JB):             5137.494
    Skew:                          -0.653   Prob(JB):                         0.00
    Kurtosis:                       9.421   Cond. No.                         1.00
    ==============================================================================
    
    Warnings:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
    Min sensitivity:  4.5456696031560783e-05
    Max sensitivity:  16.097354635704747
    Min anthropogenic:  1.818525510984059e-07
    Max anthropogenic:  10.581593556012416
    
    whole, by month:  [6.038615066384741, 5.6516331399956155, 3.6167574260665583, 3.6098735222882707, 4.572031397257306, 3.3446918701048265, 4.0643330897519885, 4.171241609545716, 4.527692373272248, 4.04522899263168, 4.917704307774026, 4.963966698187049]
    
    JUE Footprint sensitivity as dependent on the anthropogenic component:
                                OLS Regression Results                            
    ==============================================================================
    Dep. Variable:                      y   R-squared:                       0.549
    Model:                            OLS   Adj. R-squared:                  0.548
    Method:                 Least Squares   F-statistic:                     3546.
    Date:                Wed, 18 Mar 2020   Prob (F-statistic):               0.00
    Time:                        14:41:11   Log-Likelihood:                -7756.8
    No. Observations:                2920   AIC:                         1.552e+04
    Df Residuals:                    2919   BIC:                         1.552e+04
    Df Model:                           1                                         
    Covariance Type:            nonrobust                                         
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    x1             0.1538      0.003     59.550      0.000       0.149       0.159
    ==============================================================================
    Omnibus:                     1131.376   Durbin-Watson:                   0.675
    Prob(Omnibus):                  0.000   Jarque-Bera (JB):            20344.746
    Skew:                          -1.368   Prob(JB):                         0.00
    Kurtosis:                      15.638   Cond. No.                         1.00
    ==============================================================================
    
    Warnings:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
    Min sensitivity:  0.5757735794217561
    Max sensitivity:  25.797560091154086
    Min anthropogenic:  0.40398933009319526
    Max anthropogenic:  259.280754053703
    
    whole, by month:  [9.79426513103756, 8.839324623603616, 5.6197393294941005, 6.771162390941368, 7.4512192221076745, 5.89194982122019, 6.8526119814655715, 8.312364557652304, 9.268939276230816, 7.896227256939919, 7.398876363592872, 6.777548527592227]
    
    HEI Footprint sensitivity as dependent on the anthropogenic component:
                                OLS Regression Results                            
    ==============================================================================
    Dep. Variable:                      y   R-squared:                       0.314
    Model:                            OLS   Adj. R-squared:                  0.314
    Method:                 Least Squares   F-statistic:                     1335.
    Date:                Wed, 18 Mar 2020   Prob (F-statistic):          4.29e-241
    Time:                        14:46:24   Log-Likelihood:                -10217.
    No. Observations:                2920   AIC:                         2.044e+04
    Df Residuals:                    2919   BIC:                         2.044e+04
    Df Model:                           1                                         
    Covariance Type:            nonrobust                                         
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    x1             0.1311      0.004     36.544      0.000       0.124       0.138
    ==============================================================================
    Omnibus:                     1487.308   Durbin-Watson:                   0.518
    Prob(Omnibus):                  0.000   Jarque-Bera (JB):            87539.179
    Skew:                          -1.622   Prob(JB):                         0.00
    Kurtosis:                      29.627   Cond. No.                         1.00
    ==============================================================================
    
    Warnings:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
    Min sensitivity:  0.7535971240312279
    Max sensitivity:  40.64336455210985
    Min anthropogenic:  0.5677177227771298
    Max anthropogenic:  712.3404967397694
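
The OLS fits above are forced through the origin, so the reported R-squared values are computed against a zero baseline rather than the mean. If a conventional correlation measure is preferred for relating the summed sensitivity to the size of the anthropogenic component, a minimal sketch along the lines below could be run after the station loop. It assumes the lists anthro_valid_both and sensitivity_valid_both built for the last processed station are still in memory; it is only an illustration, not part of the cell above.

    import numpy as np
    import statsmodels.api as sm
    
    # paired, NaN-free values built in the loop above (last station processed)
    x = np.asarray(anthro_valid_both)
    y = np.asarray(sensitivity_valid_both)
    
    # Pearson correlation between the anthropogenic component and the summed footprint sensitivity
    r = np.corrcoef(x, y)[0, 1]
    print('Pearson r: %.3f' % r)
    
    # OLS with an intercept term, for comparison with the through-origin fit above
    results_with_const = sm.OLS(y, sm.add_constant(x)).fit()
    print(results_with_const.summary())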