Journal
Learning log: canopy health baselines
Mapping Honey Locust distribution and exploring early ideas for canopy health prediction.
This week I focused on integrating tree canopy data with spatial occurrence records from the GBIF API to begin forming a baseline understanding of Honey Locust distribution and potential stress indicators.
Highlights
- Data visualization. Built two interactive maps in Python with Folium: one plotting individual Honey Locust occurrences as points, and one showing the density of sightings across Argentina as a heatmap. Both are now part of the project's exploratory data analysis stage.
- Data alignment. Started normalizing canopy-related datasets (imagery tiles, parcel boundaries, and inspection data) so they can be joined later for feature extraction.
- Model planning. Outlined an initial idea for a small neural network baseline that could eventually use canopy-derived features — such as spectral indices or inspection text — as input signals for predicting canopy stress or overall health.
- Personal outreach. Planning to share the project with my dad, an agricultural engineer who specializes in forestry and has a strong interest in the Honey Locust. His feedback should help narrow down which environmental variables are worth tracking.
Next steps
- Prototype a data quality dashboard to quickly visualize missing imagery or incomplete coordinates.
- Experiment with a multimodal data approach (combining canopy imagery and inspection notes).
- Document the data ingestion and cleaning pipeline — from GBIF queries to Folium visualizations.
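As a starting point for the data quality dashboard idea, here is a minimal sketch of a missing-value summary in pandas. The function name `quality_summary` and the `imagery_tile` column are hypothetical placeholders, since the real canopy datasets are not joined yet; `decimalLatitude` is the coordinate column name GBIF exports use.

```python
import pandas as pd


def quality_summary(df, columns):
    """Count and percentage of missing values for each listed column."""
    missing = df[columns].isna().sum()
    return pd.DataFrame({
        "missing": missing,
        "missing_pct": (missing / len(df) * 100).round(1),
    })


# Tiny made-up sample, only to show the shape of the output.
sample = pd.DataFrame({
    "decimalLatitude": [-34.6, None, -31.4, None],
    "imagery_tile": ["a.tif", "b.tif", None, "d.tif"],  # hypothetical column
})
print(quality_summary(sample, ["decimalLatitude", "imagery_tile"]))
```

A table like this could later feed a chart or a small HTML page, but the per-column counts are probably enough to spot gaps early.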
Data workflow
The source dataset was collected from the GBIF API, querying occurrences of Gleditsia triacanthos (Honey Locust) within Argentina.
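For reference, a query like this could be sketched against GBIF's public occurrence search endpoint using only the standard library. This is an assumption about how the data might be pulled, not a record of the exact calls I ran; the function names and the selected fields are illustrative, and the page-size cap of 300 is my understanding of the API's limit.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GBIF_SEARCH_URL = "https://api.gbif.org/v1/occurrence/search"
PAGE_SIZE = 300  # assumed maximum page size for this endpoint


def build_query_url(offset=0, limit=PAGE_SIZE):
    """Build the URL for one page of Honey Locust occurrences in Argentina."""
    params = {
        "scientificName": "Gleditsia triacanthos",
        "country": "AR",          # ISO 3166-1 alpha-2 code for Argentina
        "hasCoordinate": "true",  # skip records without coordinates
        "limit": limit,
        "offset": offset,
    }
    return f"{GBIF_SEARCH_URL}?{urlencode(params)}"


def extract_rows(page):
    """Keep only the fields the maps need from one JSON response page."""
    return [
        {
            "gbifID": rec.get("key"),
            "decimalLatitude": rec.get("decimalLatitude"),
            "decimalLongitude": rec.get("decimalLongitude"),
            "eventDate": rec.get("eventDate"),
        }
        for rec in page.get("results", [])
    ]


def fetch_all(max_records=1000):
    """Page through results until GBIF reports the end (needs network access)."""
    rows, offset = [], 0
    while offset < max_records:
        with urlopen(build_query_url(offset)) as resp:
            page = json.load(resp)
        rows.extend(extract_rows(page))
        if page.get("endOfRecords", True):
            break
        offset += PAGE_SIZE
    return rows
```

Calling `fetch_all()` returns a list of dictionaries that can be passed straight to `pandas.DataFrame(...)` and written out with `to_csv`.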
After exporting the results to CSV, I cleaned the data using pandas to ensure valid geographic coordinates and remove null entries before feeding it into the visualization scripts.
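The cleaning step can be sketched roughly as follows. The column names `decimalLatitude` and `decimalLongitude` follow the GBIF export convention; the function name `clean_occurrences` and the explicit range check are assumptions about what "valid" means here, not a copy of my actual script.

```python
import pandas as pd


def clean_occurrences(df):
    """Return only the rows that have usable latitude/longitude values."""
    out = df.copy()
    # Coerce to numbers so stray text becomes NaN instead of raising.
    for col in ("decimalLatitude", "decimalLongitude"):
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # Drop rows where either coordinate is missing.
    out = out.dropna(subset=["decimalLatitude", "decimalLongitude"])
    # Keep coordinates inside the valid latitude/longitude ranges.
    in_range = (
        out["decimalLatitude"].between(-90, 90)
        & out["decimalLongitude"].between(-180, 180)
    )
    return out[in_range].reset_index(drop=True)
```

In practice the input would come from `pd.read_csv(...)` on the exported file, and the cleaned frame is what the visualization scripts read.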
Two simple Python scripts handle the visualization:
- visualize_map.py → plots individual occurrences as green markers.
- visualize_heatmap.py → generates a density-based heatmap using the same coordinates.
Both outputs are saved as interactive HTML maps (honey_locust_map.html and honey_locust_heatmap.html), making it easy to explore spatial patterns directly in the browser.
In this context, “canopy signals” refers to the features derived from canopy-level data (spectral information, canopy cover indices, or inspection-based metrics) that could later be used to model tree health.