Declustering

import pandas as pd
import geolime as geo
from pyproj import CRS
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots

geo.Project().set_crs(CRS("EPSG:20350"))

Data loading

dh = geo.read_file("../data/domained_drillholes.geo")

Declustering

Preferential sampling, with samples concentrated in areas of interest, biases the statistics computed from the raw data, so the data set needs to be declustered. Ideally, samples would be located on a regular grid or at random. The first parameter to consider in the declustering step is the cell size. To choose it, compute the declustered mean for a range of cell sizes and keep the size that minimizes or maximizes the declustered mean: the minimum when high-grade areas are oversampled, the maximum when low-grade areas are. The algorithm then assigns a weight to every sample so as to obtain the minimum declustered mean.
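To make the cell-size scan concrete, here is a minimal sketch of cell declustering written in plain NumPy. It is not GeoLime's implementation: the helper name `cell_declustering_weights`, the fixed grid origin, and the small synthetic data set are all assumptions made for illustration. Each occupied cell receives the same total weight, shared equally among the samples it contains.

```python
import numpy as np


def cell_declustering_weights(coords, cell_size, origin=0.0):
    """Illustrative helper (hypothetical, not part of GeoLime).

    Returns one weight per sample, summing to 1, for a single grid origin.
    """
    coords = np.asarray(coords, dtype=float)
    # Integer index of the grid cell containing each sample.
    cell_idx = np.floor((coords - origin) / cell_size).astype(int)
    # Map each sample to its occupied cell and count samples per cell.
    _, inverse, counts = np.unique(
        cell_idx, axis=0, return_inverse=True, return_counts=True
    )
    n_in_cell = counts[inverse.reshape(-1)]
    n_occupied = counts.size
    # Every occupied cell carries 1 / n_occupied, split among its samples.
    return 1.0 / (n_occupied * n_in_cell)


# Synthetic data (assumed): four clustered high-grade samples, three isolated ones.
coords = np.array(
    [[0.1, 0.1], [0.2, 0.2], [0.3, 0.1], [0.2, 0.3],
     [5.5, 0.5], [0.5, 5.5], [5.5, 5.5]]
)
values = np.array([60.0, 62.0, 61.0, 61.0, 40.0, 42.0, 41.0])

# Declustered mean for several candidate cell sizes.
cell_sizes = [1.0, 2.0, 10.0]
declustered_means = {
    size: float(np.sum(cell_declustering_weights(coords, size) * values))
    for size in cell_sizes
}
naive_mean = float(values.mean())
best_size = min(declustered_means, key=declustered_means.get)
```

In this toy data set the high values are clustered, so the cell size giving the lowest declustered mean is kept. A cell large enough to contain every sample gives back the naive mean.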

Naive/declustered data

Naive data refers to the raw data set, without any declustering, while declustered data refers to the same data set with the declustering weight of each sample taken into account.
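The difference between the two can be illustrated by comparing equal-weight and weighted statistics. The sketch below assumes a set of declustering weights is already available. The values and weights are made up for the example and do not come from the drillhole file.

```python
import numpy as np

# Assumed example data: four clustered high-grade samples, three isolated ones.
values = np.array([60.0, 62.0, 61.0, 61.0, 40.0, 42.0, 41.0])
# Assumed declustering weights (clustered samples are down-weighted).
weights = np.array([1 / 16] * 4 + [1 / 4] * 3)

# Naive statistics: every sample counts equally.
naive_mean = values.mean()
naive_var = values.var()

# Declustered statistics: every sample counts according to its weight.
declustered_mean = np.average(values, weights=weights)
declustered_var = np.average((values - declustered_mean) ** 2, weights=weights)
```

Here the naive mean is pulled toward the clustered high-grade samples, while the declustered mean gives the isolated samples more influence.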

Moving window

This method consists of counting the number of points (also called neighbors) $n_i$ in the volume $V_i$ centered on the point $(x_i, y_i, z_i)$.
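The neighbor count can be sketched in a few lines of NumPy. The code below is an illustration under stated assumptions, not GeoLime's implementation: the volume $V_i$ is taken to be an axis-aligned box, each point counts itself as a neighbor, and the weights are taken proportional to $1/n_i$, which is a common choice but is not stated in the text above. The helper name `moving_window_counts` is hypothetical.

```python
import numpy as np


def moving_window_counts(coords, window):
    """Illustrative helper (hypothetical): count the points n_i that fall in
    a box of size `window` centered on each point (the point itself included).
    """
    coords = np.asarray(coords, dtype=float)
    half = np.asarray(window, dtype=float) / 2.0
    # Pairwise absolute offsets between points, shape (n, n, ndim).
    offsets = np.abs(coords[:, None, :] - coords[None, :, :])
    # A point j is a neighbor of i if it lies within the half-window on every axis.
    inside = np.all(offsets <= half, axis=-1)
    return inside.sum(axis=1)


# Assumed example: three clustered points and one isolated point.
coords = np.array(
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [10.0, 10.0, 0.0]]
)
counts = moving_window_counts(coords, window=(4.0, 4.0, 4.0))

# Assumed weighting: inversely proportional to the neighbor count, summing to 1.
weights = (1.0 / counts) / np.sum(1.0 / counts)
```

The pairwise comparison uses memory proportional to the square of the number of points, so for large drillhole data sets a spatial index would be a better fit than this direct approach.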

# Shared axis styling: black frame with a light grey grid.
axis_style = dict(
    showline=True,
    linewidth=2,
    linecolor="black",
    mirror=True,
    showgrid=True,
    gridwidth=0.2,
    gridcolor="lightgrey",
)

# Plot Fe_pct from the drillholes, aggregated with the mean.
fig = geo.plot(
    dh,
    property="Fe_pct",
    agg_method="mean",
    xaxis=axis_style,
    yaxis=axis_style,
)
# Transparent plot background.
fig.update_layout(plot_bgcolor="rgba(0,0,0,0)")
fig