moved to module

Philip 2025-07-17 15:27:28 +02:00
parent 4e941467eb
commit 80c86db42b
31 changed files with 362 additions and 127 deletions

76
data/README.md Normal file
View File

@ -0,0 +1,76 @@
Get the full scoop at [NaturalEarthData.com](http://naturalearthdata.com)
_No, really! This readme is a poor substitute for the live site._
# About Natural Earth Vector
Natural Earth is a public domain map dataset available at 1:10m, 1:50m, and 1:110 million scales. Featuring tightly integrated vector (here) and raster data ([over there](https://github.com/nvkelso/natural-earth-raster)), with Natural Earth you can make a variety of visually pleasing, well-crafted maps with cartography or GIS software.
Natural Earth was built through a collaboration of many [volunteers](http://www.naturalearthdata.com/about/contributors/) and is supported by [NACIS](http://www.nacis.org/) (North American Cartographic Information Society), and is free for use in any type of project (see our [Terms of Use](http://www.naturalearthdata.com/about/terms-of-use/) page for more information).
[Get the Data »](http://www.naturalearthdata.com/downloads)
![Convenience](http://www.naturalearthdata.com/wp-content/uploads/2009/08/home_image_11.png)
# Convenience
Natural Earth solves a problem: finding suitable data for making small-scale maps. In a time when the web is awash in geospatial data, cartographers are forced to waste time sifting through confusing tangles of poorly attributed data to make clean, legible maps. Because your time is valuable, Natural Earth data comes ready-to-use.
![Neatness Counts](http://www.naturalearthdata.com/wp-content/uploads/2009/08/home_image_21.png)
# Neatness Counts
The carefully generalized linework maintains consistent, recognizable geographic shapes at 1:10m, 1:50m, and 1:110m scales. Natural Earth was built from the ground up so you will find that all data layers align precisely with one another. For example, where rivers and country borders are one and the same, the lines are coincident.
![GIS Attributes](http://www.naturalearthdata.com/wp-content/uploads/2009/08/home_image_32.png)
# GIS Attributes
Natural Earth, however, is more than just a collection of pretty lines. The data attributes are equally important for mapmaking. Most data contain embedded feature names, which are ranked by relative importance. Other attributes facilitate faster map production, such as width attributes assigned to river segments for creating tapers.
# Versioning
The 2.0 release in 2012 marked the project's shift from so-called marketing versions to [semantic versioning](http://semver.org/).
Natural Earth is a big project: hundreds of interdependent files weighing in at several gigabytes in total. SemVer is a simple set of rules and requirements around version numbers. For our project, the data layout is the API.
* **Version format of X.Y.Z** (Major.Minor.Patch).
* **Backwards-incompatible** changes increment the major version X.
* **Backwards-compatible** additions and changes increment the minor version Y.
* **Bug fixes** that do not affect file and field names increment the patch version Z.
Major version increments:
* Changing existing data **file names**
* Changing existing data **column (field) names**
* Removing **`FeatureCla` field attribute values**
* Additions, deletions to **admin-0**
* Introduce **significant new themes**
Minor version increments:
* Any shape or attribute change in **admin-0**
* Additions, deletions, and any shape or attribute changes in **admin-1**
* Additions, deletions to **any theme**
* Major shape or attribute changes in **any theme**
* Adding, changing **`FeatureCla` field attribute values**
* Introduce **minor new themes**
Patch version increments:
* Minor shape or attribute changes in **any theme**
* Bug fixes to shape, attributes in **any theme**
Under this scheme, version numbers and the way they change convey meaning about the underlying code and what has been modified from one version to the next.
When we introduce a new version of Natural Earth, you can tell from the version number how much effort you will need to expend to integrate the data with your map implementation (see the sketch after this list).
* **Bug fixes Z**: simply use the new data files, replacing your old files.
* **Minor version Y**: limited integration challenges.
* **Major version X**: significant integration challenges, either around changed file structure, field layout, field values like `FeatureCla` used in symbolizing data, or significant new additions or significant changes to existing themes.
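As an illustration only (the version tuples below are hypothetical, not real Natural Earth releases), a downstream build script could translate these rules into a quick check before swapping in a new download:

```python
# Hypothetical check for a map build pipeline consuming Natural Earth data.
installed = (5, 0, 0)   # version the map styles were authored against (made up)
candidate = (5, 1, 0)   # newly downloaded release (made up)

if candidate[0] > installed[0]:
    print("Major release: expect renamed files/fields or new themes; review styles and joins.")
elif candidate[1] > installed[1]:
    print("Minor release: limited integration work; shapes or attributes may have changed.")
else:
    print("Patch release: drop-in replacement of the existing data files.")
```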
# &etc
Natural Earth is maintained by Nathaniel V. KELSO ([@nvkelso](https://github.com/nvkelso/)) and Tom Patterson.
The project transitioned to GitHub in 2012. The versioned files live here for collaboration; the public frontend remains at [NaturalEarthData.com](http://naturalearthdata.com).

13
pyproject.toml Normal file
View File

@ -0,0 +1,13 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "energy_data_project"
version = "0.1.0"
requires-python = ">3.10"
dependencies = [
]

[tool.setuptools.packages.find]
where = ["src"]
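As a rough sketch of how this configuration is meant to be used (the package folder name under `src/` is an assumption inferred from the `name` field and from the imports in the test files further down):

```python
# Assumed layout (hypothetical apart from pyproject.toml and the src/ folder):
#   pyproject.toml
#   data/...
#   src/energy_data_project/
#       __init__.py
#       T19_geopandas.py
#       ...
#
# After an editable install from the project root:
#   pip install -e .
# the package resolves from anywhere, e.g. in the tests:
import energy_data_project

print(energy_data_project.find_neighbouring_countries)
```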

View File

@ -1,62 +0,0 @@
# pip/conda install geopandas
import geopandas as gpd
import matplotlib.pyplot as plt

world = gpd.read_file("../data/110m_cultural.zip", layer="ne_110m_admin_0_countries")
print(list(world.columns))
print(world.iloc[0, -1])
print(world.iloc[0]["NAME"])
print()
print(world.crs)
# EPSG:4326 (latitude/longitude)
# EPSG:3857 Web Mercator
w2 = world.to_crs("EPSG:3857")
print(w2.iloc[0, -1])
print(world[["NAME", "CONTINENT", "POP_EST", "GDP_MD", "geometry"]].head())

# how large is each country?
world_proj_cylinder = world.to_crs("EPSG:3857")  # cylindrical projection
world_proj_cylinder["area_sq_km"] = world_proj_cylinder.geometry.area / 1_000_000
print(world_proj_cylinder[["NAME", "area_sq_km"]].head(10))

cities = gpd.read_file("../data/110m_cultural.zip",
                       layer="ne_110m_populated_places")
print(cities[["NAME", "geometry"]].head())
print(len(cities))

fig, ax = plt.subplots()
world.plot(column="CONTINENT", legend=True, ax=ax)
cities.plot(ax=ax, color="red")
plt.show()

# find the neighbours of country X
def find_neighbouring_countries(df, country_name):
    target = df[df["NAME"] == country_name]
    if len(target) == 0:
        print(f"{country_name} not found")
        return []
    # (could also be done with sjoin)
    neighbours = []
    for idx, country in df.iterrows():
        if country["NAME"] == country_name:
            continue
        if target.geometry.iloc[0].touches(country.geometry):
            neighbours.append(country["NAME"])
    return neighbours

res = find_neighbouring_countries(world, "Spain")
print(res)

View File

@ -1,55 +0,0 @@
import geopandas as gpd
import matplotlib.pyplot as plt

world = gpd.read_file("../data/110m_cultural.zip", layer="ne_110m_admin_0_countries")
cities = gpd.read_file("../data/110m_cultural.zip",
                       layer="ne_110m_populated_places")

def find_neighbouring_countries(df, country_name):
    target = df[df["NAME"] == country_name]
    if len(target) == 0:
        print(f"{country_name} not found")
        return gpd.GeoDataFrame()
    # sjoin = spatial join
    neighbours = gpd.sjoin(target, df, how="inner", predicate="touches")
    return neighbours[["NAME_right", "CONTINENT_right", "geometry"]]

def get_close_cities(df, city_name):
    target = df[df["NAMEASCII"] == city_name]
    if len(target) == 0:
        print(f"{city_name} not found")
        return gpd.GeoDataFrame()
    # sjoin_nearest = spatial join on the nearest geometry
    neighbours = gpd.sjoin_nearest(target, df, max_distance=10e12)
    return neighbours

# spatial predicates: intersects, contains, within, touches, crosses, overlaps

german_neighbours = find_neighbouring_countries(world, "Germany")
print(german_neighbours[["NAME_right", "CONTINENT_right"]])
print(len(german_neighbours))
print("-" * 100)

w2 = world.to_crs("EPSG:3857")
c2 = cities.to_crs("EPSG:3857")
print(list(c2.columns))
german_neighbours = get_close_cities(c2, "Paris")
print(len(german_neighbours))
print(german_neighbours)

w2.plot(column="CONTINENT", legend=True)
plt.show()

# Exercises:
# 1) get_cities_in_country (sjoin)
# 2) get_cities_in_countries (concat)

0
src/__init__.py Normal file
View File

View File

@ -54,7 +54,7 @@ def get_larger_cities_north_of_city(df, city: str, country: Optional[str]=None):
if __name__ == "__main__":
    print("Country Population")
-    world_cities = pd.read_excel("../data/worldcities.xlsx")
+    world_cities = pd.read_excel("../../data/worldcities.xlsx")
    print(world_cities.columns)
    german_population = get_population_of_country(world_cities, "Germany")
    print(f"Population of Germany: {german_population}")

View File

@ -1,6 +1,6 @@
import pandas as pd
-beverages = pd.read_csv("../data/beverages.csv")#
+beverages = pd.read_csv("../../data/beverages.csv")#
print(beverages)
@ -24,7 +24,7 @@ for name, info in groups:
print()
# print(help(pd.read_csv))
-donations_df = pd.read_csv("../data/donations.csv")
+donations_df = pd.read_csv("../../data/donations.csv")
print(donations_df)
subset = donations_df[["city", "job", "income", "donations"]]

View File

@ -1,6 +1,6 @@
import pandas as pd
-energy_df = pd.read_csv("../data/germany_energy_mix_2019_2024.csv")
+energy_df = pd.read_csv("../../data/germany_energy_mix_2019_2024.csv")
print(energy_df)
print(energy_df.columns)

View File

@ -24,7 +24,7 @@ pivoted_df = df.pivot(index="Product",
print(pivoted_df)
-beverages = pd.read_csv("../data/beverages.csv")
+beverages = pd.read_csv("../../data/beverages.csv")
beverages["Day"] = (["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"] * 35)[:103]
print(beverages)
@ -50,7 +50,7 @@ print(coffees)
# df.index <- name of the index column
print("\n"*3)
-energy_df = pd.read_csv("../data/germany_energy_mix_2019_2024.csv")
+energy_df = pd.read_csv("../../data/germany_energy_mix_2019_2024.csv")
# 1) new column
energy_df["Year Quarter"] = energy_df["Year"].astype(str) + " " + energy_df['Quarter']
# drop: with axis=0 the rows named [Year, Quarter] would be deleted

View File

@ -3,7 +3,7 @@ import pandas as pd
# year-month-day
# month/day/year (US notation)
-beverages_by_date = pd.read_csv("../data/beverages_by_date.csv",
+beverages_by_date = pd.read_csv("../../data/beverages_by_date.csv",
                                 index_col=0)
# converted to the datetime type
@ -34,7 +34,7 @@ daily = beverages_by_date.resample("8h").bfill()
print(daily.loc["2024-02-8":"2024-02-14"])
# exercise with time series
-solar_df = pd.read_csv("../data/Balkonkraftwerk.csv",
+solar_df = pd.read_csv("../../data/Balkonkraftwerk.csv",
                        index_col=0)
solar_df.index = pd.to_datetime(solar_df.index)
print(solar_df)

View File

@ -4,7 +4,7 @@ import matplotlib.pyplot as plt
import pandas as pd
from pandas.core.config_init import max_cols
-energy_df = pd.read_csv("../data/germany_energy_mix_2019_2024.csv")
+energy_df = pd.read_csv("../../data/germany_energy_mix_2019_2024.csv")
# 1) new column
energy_df["Year Quarter"] = energy_df["Year"].astype(str) + " " + energy_df['Quarter']
energy_df.drop(["Year", "Quarter"], axis=1, inplace=True)

View File

@ -19,7 +19,7 @@ fig = cities_df.plot(x="Stadt", y=["Population", "Flaeche"], backend="plotly")
fig.show()
-energy_df = pd.read_csv("../data/germany_energy_mix_2019_2024.csv")
+energy_df = pd.read_csv("../../data/germany_energy_mix_2019_2024.csv")
# 1) new column
energy_df["Year Quarter"] = energy_df["Year"].astype(str) + " " + energy_df['Quarter']
energy_df.drop(["Year", "Quarter"], axis=1, inplace=True)

View File

@ -0,0 +1,63 @@
# pip/conda install geopandas
import geopandas as gpd
import matplotlib.pyplot as plt


# find the neighbours of country X
def find_neighbouring_countries(df, country_name):
    target = df[df["NAME"] == country_name]
    if len(target) == 0:
        print(f"{country_name} not found")
        return []
    # (could also be done with sjoin)
    neighbours = []
    for idx, country in df.iterrows():
        if country["NAME"] == country_name:
            continue
        if target.geometry.iloc[0].touches(country.geometry):
            neighbours.append(country["NAME"])
    return neighbours


if __name__ == "__main__":
    world = gpd.read_file("../../data/110m_cultural.zip", layer="ne_110m_admin_0_countries")
    print(list(world.columns))
    print(world.iloc[0, -1])
    print(world.iloc[0]["NAME"])
    print()
    print(world.crs)
    # EPSG:4326 (latitude/longitude)
    # EPSG:3857 Web Mercator
    w2 = world.to_crs("EPSG:3857")
    print(w2.iloc[0, -1])
    print(world[["NAME", "CONTINENT", "POP_EST", "GDP_MD", "geometry"]].head())

    # how large is each country?
    world_proj_cylinder = world.to_crs("EPSG:3857")  # cylindrical projection
    world_proj_cylinder["area_sq_km"] = world_proj_cylinder.geometry.area / 1_000_000
    print(world_proj_cylinder[["NAME", "area_sq_km"]].head(10))

    cities = gpd.read_file("../../data/110m_cultural.zip",
                           layer="ne_110m_populated_places")
    print(cities[["NAME", "geometry"]].head())
    print(len(cities))

    fig, ax = plt.subplots()
    world.plot(column="CONTINENT", legend=True, ax=ax)
    cities.plot(ax=ax, color="red")
    plt.show()

    res = find_neighbouring_countries(world, "Germany")
    print(res)

View File

@ -0,0 +1,92 @@
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd

world = gpd.read_file("../../data/110m_cultural.zip", layer="ne_110m_admin_0_countries")
cities = gpd.read_file("../../data/110m_cultural.zip",
                       layer="ne_110m_populated_places")


def find_neighbouring_countries(df, country_name):
    target = df[df["NAME"] == country_name]
    if len(target) == 0:
        print(f"{country_name} not found")
        return gpd.GeoDataFrame()
    # sjoin = spatial join
    neighbours = gpd.sjoin(target, df, how="inner", predicate="touches")
    return neighbours[["NAME_right", "CONTINENT_right", "geometry"]]


def get_close_cities(df, city_name):
    target = df[df["NAMEASCII"] == city_name]
    if len(target) == 0:
        print(f"{city_name} not found")
        return gpd.GeoDataFrame()
    # sjoin_nearest = spatial join on the nearest geometry
    # combined = gpd.sjoin(target, df, how="inner")
    # combined["distance"] = combined.distance(target.iloc[0])
    neighbours = gpd.sjoin_nearest(target, df, max_distance=10e12)
    return neighbours


# spatial predicates: intersects, contains, within, touches, crosses, overlaps

german_neighbours = find_neighbouring_countries(world, "Germany")
print(german_neighbours[["NAME_right", "CONTINENT_right"]])
print(len(german_neighbours))
print("-" * 100)

w2 = world.to_crs("EPSG:3857")
c2 = cities.to_crs("EPSG:3857")
print(list(c2.columns))
german_neighbours = get_close_cities(c2, "Paris")
print(len(german_neighbours))
print(german_neighbours)

w2.plot(column="CONTINENT", legend=True)
plt.show()


def get_cities_in_country(city_df, world_df, country):
    country_match = world_df[world_df["NAME"] == country]
    if len(country_match) == 0:
        print(f"Returning empty for {country}")
        return gpd.GeoDataFrame()
    cities = gpd.sjoin(country_match, city_df,
                       how="inner",
                       predicate="contains")
    cities = cities[["NAME_left", "NAMEASCII"]]
    return cities


def get_cities_in_countries(city_df, world_df, countries):
    res = get_cities_in_country(city_df, world_df, countries[0])
    for country in countries[1:]:
        tmp = get_cities_in_country(city_df, world_df, country)
        res = pd.concat((res, tmp))
    return res


# Exercises:
# 1) get_cities_in_country (sjoin)  # NAME_right, NAME_left
# 2) get_cities_in_countries (concat)
us_cities = get_cities_in_country(cities, world, "United States of America")
print(us_cities)
swiss_cities = get_cities_in_country(cities, world, "Switzerland")
print(swiss_cities)
central_european_cities = get_cities_in_countries(
    cities, world,
    ["France", "Switzerland", "Narnia", "Germany", "Denmark"]
)
print(central_european_cities)
# United States = 9
# France = 4
# Italy: 3
# Germany: 1

View File

@ -0,0 +1,49 @@
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
from shapely.geometry import Point

capitals = {
    'Paris': Point(2.3522, 48.8566),
    'Berlin': Point(13.4050, 52.5200),
    'Madrid': Point(-3.7038, 40.4168),
    'Rome': Point(12.4964, 41.9028),
    'London': Point(-0.1278, 51.5074)
}
capitals_gdf = gpd.GeoDataFrame(
    [{'city': city, 'geometry': point} for city, point in capitals.items()],
    crs='EPSG:4326'
)
capitals_proj = capitals_gdf.to_crs('EPSG:3857')  # Web Mercator
print([{'city': city, 'geometry': point} for city, point in capitals.items()])

# compute all pairwise distances between the cities
print(capitals_proj)
print("Distances between cities")
for i, city1 in capitals_proj.iterrows():
    for j, city2 in capitals_proj.iterrows():
        if i >= j:
            continue
        dist_m = city1["geometry"].distance(city2["geometry"])
        dist_km = dist_m / 1_000
        print(f"Distance between {city1['city']} and {city2['city']} = {dist_km:.0f}km")
print()
# buffer
world = gpd.read_file(
    "../../data/110m_cultural.zip",
    layer="ne_110m_admin_0_countries")
switzerland = world[world['NAME'] == 'Switzerland']
if len(switzerland) > 0:
    # project to a metric CRS (Web Mercator) so the buffer distance can be given in metres
    # note: Web Mercator exaggerates distances away from the equator
    switzerland_proj = switzerland.to_crs('EPSG:3857')
    buffer_200km = switzerland_proj.geometry.buffer(200_000)  # 200 km in metres
    buffer_gdf = gpd.GeoDataFrame(geometry=buffer_200km.to_crs('EPSG:4326'))
    # find countries that intersect this buffer
    countries_near_switzerland = gpd.sjoin(world, buffer_gdf, how='inner', predicate='intersects')
    nearby_countries = countries_near_switzerland['NAME'].tolist()
    print(f"Countries within 200km of Switzerland: {nearby_countries}")

View File

@ -0,0 +1,6 @@
# __init__.py runs every time the module is imported
from .T19_geopandas import find_neighbouring_countries
from .T05_Klassen_properties import Circle
from .T08_WorldCities import *

View File

@ -0,0 +1,39 @@
import unittest

import pandas as pd

from energy_data_project import get_larger_cities_north_of_city, get_cities_beyond_latitude

# main.py  # python main.py
# pip install -e /path/to/module
# datei1.py
# test_datei1.py
# create a module and move datei1.py into it
# datei1.py
# test_datei1.py


class MyTestCase(unittest.TestCase):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.world_cities = pd.read_excel("data/worldcities.xlsx")

    def setUp(self):
        print("Loading data")

    def tearDown(self):
        print("After test")

    def test_get_larger_cities_north_of_city(self):
        # self.assertEqual(True, False)  # placeholder from the IDE template
        cities_north = get_larger_cities_north_of_city(self.world_cities, "Berlin")
        assert cities_north.iloc[0, 0], f"Error {cities_north.iloc[0, 0]} "

    def test_something(self):
        # self.assertEqual(True, False)  # placeholder from the IDE template
        cities_north = get_larger_cities_north_of_city(self.world_cities, "Berlin")
        assert cities_north.iloc[0, 0] == "Moscow", f"Error {cities_north.iloc[0, 0]} "


if __name__ == '__main__':
    unittest.main()

View File

@ -0,0 +1,14 @@
import energy_data_project
import geopandas as gpd


def test_find_neighbouring_countries():
    world = gpd.read_file("data/110m_cultural.zip",
                          layer="ne_110m_admin_0_countries")
    german_neighbours = energy_data_project.find_neighbouring_countries(
        world,
        "Germany")
    # with pytest, plain assert statements are enough
    assert 1 == 1, "Correct"
    assert german_neighbours == ['France', 'Poland', 'Austria', 'Switzerland', 'Luxembourg', 'Belgium', 'Netherlands', 'Denmark', 'Czechia'], "Error german neighbours not found"