Tom West Top Cutting

import geolime as geo
import seaborn as sns
import numpy as np
dh_tom = geo.read_file("../data/dh_tom_comp.geo")
sns.ecdfplot(x=dh_tom["Zn_pct"], log_scale=True, stat='percent')
<Axes: ylabel='Percent'>
[Figure: empirical cumulative distribution of Zn_pct, log-scaled x-axis, y-axis in percent]

Using NumPy, we can compute the coefficient of variation, that is, the standard deviation divided by the mean.

Note that we use the nan-aware functions nanmean and nanstd because the array contains nan values. With the plain mean and std functions, the results would be nan.

np.nanstd(dh_tom['Zn_pct']) / np.nanmean(dh_tom['Zn_pct'])
0.7653989945584244
np.nanmean(dh_tom['Zn_pct'])
4.901401374850275
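
To make the difference between the plain and the nan-aware functions concrete, here is a minimal sketch on a small made-up array. The `grades` values are purely illustrative and are not taken from the Tom West data set.

```python
import numpy as np

# Hypothetical grades (not the Tom West data); one assay is missing.
grades = np.array([1.0, 2.0, np.nan, 3.0])

plain_mean = np.mean(grades)    # nan: the missing value propagates
nan_mean = np.nanmean(grades)   # 2.0: the missing value is ignored
nan_std = np.nanstd(grades)     # population standard deviation (ddof=0) of 1, 2, 3

# Coefficient of variation computed on the valid values only.
cv = nan_std / nan_mean
print(plain_mean, nan_mean, cv)
```

Note that np.nanstd uses ddof=0 by default, so the value is a population standard deviation rather than a sample one.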

The nan values can also be filtered out at the GeoLime level, using a region expression.

np.mean(dh_tom.data('Zn_pct', region="Zn_pct.notna()"))
4.901401374850275

To compare candidate capping values, we can loop over them and compute the mean obtained after each capping.

We first create 50 evenly spaced values between the minimum and the maximum of the data.

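capped_mean = []
# 50 candidate caps, evenly spaced from the minimum to the maximum grade
capped_values = np.linspace(
    start=np.nanmin(dh_tom['Zn_pct']),
    stop=np.nanmax(dh_tom['Zn_pct']),
    num=50
)
for cap in capped_values:
    # Work on a copy so the stored Zn_pct values are never modified
    zn_intermed = np.copy(dh_tom['Zn_pct'])
    zn_intermed[zn_intermed > cap] = cap
    capped_mean.append(np.nanmean(zn_intermed))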

sns.scatterplot(x=capped_values, y=capped_mean)
<Axes: >
[Figure: scatter plot of the capped mean against the capping value]
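
The same sweep can be written without an explicit loop. The following is a minimal NumPy sketch on synthetic lognormal grades. The array `grades`, the random seed, and the distribution parameters are illustrative assumptions, not the Tom West data. It relies on broadcasting and on np.minimum, which leaves nan values in place.

```python
import numpy as np

# Hypothetical skewed grades with a few missing assays (not the Tom West data).
rng = np.random.default_rng(0)
grades = rng.lognormal(mean=1.0, sigma=0.8, size=500)
grades[::50] = np.nan

# 50 candidate caps between the minimum and the maximum valid grade.
caps = np.linspace(np.nanmin(grades), np.nanmax(grades), num=50)

# Broadcast to shape (50, 500): row i holds the grades capped at caps[i].
# np.minimum propagates nan, so missing assays stay missing.
capped = np.minimum(grades[np.newaxis, :], caps[:, np.newaxis])

# One capped mean per candidate cap, ignoring the missing assays.
capped_mean = np.nanmean(capped, axis=1)
```

With the highest cap nothing is modified, so the last capped mean equals the raw mean. With the lowest cap every valid value collapses to the minimum.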

We can also create 50 evenly spaced values between the third quartile and the maximum, to zoom in on the most interesting part of the distribution.

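capped_mean = []
# 50 candidate caps, evenly spaced from the third quartile to the maximum grade
capped_values = np.linspace(
    start=np.nanquantile(dh_tom['Zn_pct'], 0.75),
    stop=np.nanmax(dh_tom['Zn_pct']),
    num=50
)
for cap in capped_values:
    # Work on a copy so the stored Zn_pct values are never modified
    zn_intermed = np.copy(dh_tom['Zn_pct'])
    zn_intermed[zn_intermed > cap] = cap
    capped_mean.append(np.nanmean(zn_intermed))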

sns.lineplot(x=capped_values, y=capped_mean)
<Axes: >
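[Figure: line plot of the capped mean against the capping value, from the third quartile to the maximum]
# Standard deviation before capping, for comparison with the capped property
np.nanstd(dh_tom['Zn_pct'])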
3.7515276842376797

Once the capping value is determined, we can create a new property in which values above 25 are capped at 25.

dh_tom.set_property(name='Zn_cap', data='Zn_pct')
dh_tom.update_property(name='Zn_cap', data='25', region="(Zn_cap > 25)")
np.nanstd(dh_tom['Zn_cap'])
3.7428208073766127
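
For readers who want to see the capping rule outside GeoLime, here is a minimal plain-NumPy sketch on a small made-up array. The values in `zn` and the names `zn_cap`, `std_raw`, and `std_cap` are illustrative assumptions. This is not how GeoLime implements set_property or update_property; it only shows the same rule applied to an array.

```python
import numpy as np

# Hypothetical grades; 40.0 plays the role of an outlier, nan a missing assay.
zn = np.array([2.0, 5.0, np.nan, 12.0, 40.0])
cap = 25.0

# Values above the cap are replaced by the cap.
# nan > cap evaluates to False, so the missing assay is kept as nan.
zn_cap = np.where(zn > cap, cap, zn)

# Spread before and after capping, ignoring the missing assay.
std_raw = np.nanstd(zn)
std_cap = np.nanstd(zn_cap)
print(zn_cap, std_raw, std_cap)
```

np.where returns a new array, so `zn` keeps its original values and the raw and capped versions can be compared side by side, as done above with Zn_pct and Zn_cap.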