moved to module
parent 4e941467eb
commit 80c86db42b

76 data/README.md (new file)
@@ -0,0 +1,76 @@
Get the full scoop at [NaturalEarthData.com](http://naturalearthdata.com)

_No, really! This readme is a poor substitute for the live site._

# About Natural Earth Vector

Natural Earth is a public domain map dataset available at 1:10m, 1:50m, and 1:110 million scales. Featuring tightly integrated vector (here) and raster data ([over there](https://github.com/nvkelso/natural-earth-raster)), with Natural Earth you can make a variety of visually pleasing, well-crafted maps with cartography or GIS software.

Natural Earth was built through a collaboration of many [volunteers](http://www.naturalearthdata.com/about/contributors/), is supported by [NACIS](http://www.nacis.org/) (North American Cartographic Information Society), and is free for use in any type of project (see our [Terms of Use](http://www.naturalearthdata.com/about/terms-of-use/) page for more information).

[Get the Data »](http://www.naturalearthdata.com/downloads)



# Convenience

Natural Earth solves a problem: finding suitable data for making small-scale maps. In a time when the web is awash in geospatial data, cartographers are forced to waste time sifting through confusing tangles of poorly attributed data to make clean, legible maps. Because your time is valuable, Natural Earth data comes ready to use.



# Neatness Counts

The carefully generalized linework maintains consistent, recognizable geographic shapes at 1:10m, 1:50m, and 1:110m scales. Natural Earth was built from the ground up, so you will find that all data layers align precisely with one another. For example, where rivers and country borders are one and the same, the lines are coincident.



# GIS Attributes

Natural Earth, however, is more than just a collection of pretty lines. The data attributes are equally important for mapmaking. Most data contain embedded feature names, which are ranked by relative importance. Other attributes facilitate faster map production, such as width attributes assigned to river segments for creating tapers.

# Versioning

The 2.0 release in 2012 marked the project's shift from so-called marketing versions to [semantic versioning](http://semver.org/).

Natural Earth is a big project with hundreds of files that depend on each other, and the total weighs in at several gigabytes. SemVer is a simple set of rules and requirements around version numbers. For our project, the data layout is the API.

* **Version format of X.Y.Z** (Major.Minor.Patch).
* **Backwards-incompatible** changes increment the major version X.
* **Backwards-compatible** additions/changes increment the minor version Y.
* **Bug fixes** not affecting file and field names increment the patch version Z.

Major version increments:

* Changing existing data **file names**
* Changing existing data **column (field) names**
* Removing **`FeatureCla` field attribute values**
* Additions, deletions to **admin-0**
* Introducing **significant new themes**

Minor version increments:

* Any shape or attribute change in **admin-0**
* Additions, deletions, and any shape or attribute changes in **admin-1**
* Additions, deletions to **any theme**
* Major shape or attribute changes in **any theme**
* Adding, changing **`FeatureCla` field attribute values**
* Introducing **minor new themes**

Patch version increments:

* Minor shape or attribute changes in **any theme**
* Bug fixes to shapes, attributes in **any theme**

Under this scheme, version numbers and the way they change convey meaning about the underlying data and what has been modified from one version to the next.

When we introduce a new version of Natural Earth, you can tell by the version number how much effort you will need to expend to integrate the data with your map implementation.

* **Bug fixes Z**: simply use the new data files, replacing your old files.
* **Minor version Y**: limited integration challenges.
* **Major version X**: significant integration challenges, either around changed file structure, field layout, field values like `FeatureCla` used in symbolizing data, or significant new additions or significant changes to existing themes.
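The integration-effort rules above can be sketched in code. A minimal, hypothetical helper (not part of Natural Earth) that classifies the upgrade effort from two version strings:

```python
def upgrade_effort(old: str, new: str) -> str:
    """Classify integration effort between two SemVer versions (hypothetical helper)."""
    old_parts = tuple(int(p) for p in old.split("."))
    new_parts = tuple(int(p) for p in new.split("."))
    if new_parts[0] != old_parts[0]:
        return "major: file/field names, FeatureCla values, or admin-0 may have changed"
    if new_parts[1] != old_parts[1]:
        return "minor: limited integration challenges"
    if new_parts[2] != old_parts[2]:
        return "patch: drop-in file replacement"
    return "no change"

print(upgrade_effort("2.0.0", "3.0.0"))  # starts with "major"
print(upgrade_effort("2.0.0", "2.1.0"))  # starts with "minor"
```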
# &etc

Natural Earth is maintained by Nathaniel V. KELSO ([@nvkelso](https://github.com/nvkelso/)) and Tom Patterson.

The project transitioned to GitHub in 2012. Versioned files are here to collaborate around. The frontend still lives at [NaturalEarthData.com](http://naturalearthdata.com).
13 pyproject.toml (new file)
@@ -0,0 +1,13 @@
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "energy_data_project"
version = "0.1.0"
requires-python = ">3.10"
dependencies = [
]

[tool.setuptools.packages.find]
where = ["src"]
@@ -1,62 +0,0 @@ (file removed; its content moved into src/energy_data_project/T19_geopandas.py below)
@@ -1,55 +0,0 @@ (file removed; its content moved into src/energy_data_project/T20_Geopandas_joining.py below)
0 src/__init__.py (new, empty file)
@@ -54,7 +54,7 @@ def get_larger_cities_north_of_city(df, city: str, country: Optional[str]=None):

 if __name__ == "__main__":
     print("Country Population")
-    world_cities = pd.read_excel("../data/worldcities.xlsx")
+    world_cities = pd.read_excel("../../data/worldcities.xlsx")
     print(world_cities.columns)
     german_population = get_population_of_country(world_cities, "Germany")
     print(f"Population of Germany: {german_population}")
@@ -1,6 +1,6 @@
 import pandas as pd

-beverages = pd.read_csv("../data/beverages.csv")#
+beverages = pd.read_csv("../../data/beverages.csv")#
 print(beverages)


@@ -24,7 +24,7 @@ for name, info in groups:

 print()
 # print(help(pd.read_csv))
-donations_df = pd.read_csv("../data/donations.csv")
+donations_df = pd.read_csv("../../data/donations.csv")
 print(donations_df)

 subset = donations_df[["city", "job", "income", "donations"]]
@@ -1,6 +1,6 @@
 import pandas as pd

-energy_df = pd.read_csv("../data/germany_energy_mix_2019_2024.csv")
+energy_df = pd.read_csv("../../data/germany_energy_mix_2019_2024.csv")
 print(energy_df)
 print(energy_df.columns)

@@ -24,7 +24,7 @@ pivoted_df = df.pivot(index="Product",

 print(pivoted_df)

-beverages = pd.read_csv("../data/beverages.csv")
+beverages = pd.read_csv("../../data/beverages.csv")
 beverages["Day"] = (["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"] * 35)[:103]
 print(beverages)

@@ -50,7 +50,7 @@ print(coffees)

 # df.index <- name of the index column
 print("\n"*3)
-energy_df = pd.read_csv("../data/germany_energy_mix_2019_2024.csv")
+energy_df = pd.read_csv("../../data/germany_energy_mix_2019_2024.csv")
 # 1) new column
 energy_df["Year Quarter"] = energy_df["Year"].astype(str) + " " + energy_df['Quarter']
 # drop: axis=0 would delete the rows named [Year, Quarter]
@@ -3,7 +3,7 @@ import pandas as pd
 # year-month-day
 # month/day/year (US style)

-beverages_by_date = pd.read_csv("../data/beverages_by_date.csv",
+beverages_by_date = pd.read_csv("../../data/beverages_by_date.csv",
                                 index_col=0)

 # converted to dates
@@ -34,7 +34,7 @@ daily = beverages_by_date.resample("8h").bfill()
 print(daily.loc["2024-02-8":"2024-02-14"])

 # exercise with time series
-solar_df = pd.read_csv("../data/Balkonkraftwerk.csv",
+solar_df = pd.read_csv("../../data/Balkonkraftwerk.csv",
                        index_col=0)
 solar_df.index = pd.to_datetime(solar_df.index)
 print(solar_df)
@@ -4,7 +4,7 @@ import matplotlib.pyplot as plt
 import pandas as pd
 from pandas.core.config_init import max_cols

-energy_df = pd.read_csv("../data/germany_energy_mix_2019_2024.csv")
+energy_df = pd.read_csv("../../data/germany_energy_mix_2019_2024.csv")
 # 1) new column
 energy_df["Year Quarter"] = energy_df["Year"].astype(str) + " " + energy_df['Quarter']
 energy_df.drop(["Year", "Quarter"], axis=1, inplace=True)
@@ -19,7 +19,7 @@ fig = cities_df.plot(x="Stadt", y=["Population", "Flaeche"], backend="plotly")
 fig.show()


-energy_df = pd.read_csv("../data/germany_energy_mix_2019_2024.csv")
+energy_df = pd.read_csv("../../data/germany_energy_mix_2019_2024.csv")
 # 1) new column
 energy_df["Year Quarter"] = energy_df["Year"].astype(str) + " " + energy_df['Quarter']
 energy_df.drop(["Year", "Quarter"], axis=1, inplace=True)
63 src/energy_data_project/T19_geopandas.py (new file)
@@ -0,0 +1,63 @@
# pip/conda install geopandas

import geopandas as gpd
import matplotlib.pyplot as plt


# find the neighbours of country X
def find_neighbouring_countries(df, country_name):
    target = df[df["NAME"] == country_name]

    if len(target) == 0:
        print(f"{country_name} not found")
        return []

    # manual pairwise loop; gpd.sjoin would do this in one call (see T20)
    neighbours = []
    for idx, country in df.iterrows():
        if country["NAME"] == country_name:
            continue
        if target.geometry.iloc[0].touches(country.geometry):
            neighbours.append(country["NAME"])

    return neighbours


if __name__ == "__main__":
    world = gpd.read_file("../../data/110m_cultural.zip", layer="ne_110m_admin_0_countries")

    print(list(world.columns))

    print(world.iloc[0, -1])
    print(world.iloc[0]["NAME"])

    print()
    print(world.crs)

    # EPSG:4326 (latitude/longitude)
    # EPSG:3857 Web Mercator

    w2 = world.to_crs("EPSG:3857")
    print(w2.iloc[0, -1])

    print(world[["NAME", "CONTINENT", "POP_EST", "GDP_MD", "geometry"]].head())

    # how big is each country?
    world_proj_cylinder = world.to_crs("EPSG:3857")  # cylindrical projection; areas are inflated towards the poles
    world_proj_cylinder["area_sq_km"] = world_proj_cylinder.geometry.area / 1_000_000
    print(world_proj_cylinder[["NAME", "area_sq_km"]].head(10))

    cities = gpd.read_file("../../data/110m_cultural.zip",
                           layer="ne_110m_populated_places")

    print(cities[["NAME", "geometry"]].head())
    print(len(cities))

    fig, ax = plt.subplots()
    world.plot(column="CONTINENT", legend=True, ax=ax)
    cities.plot(ax=ax, color="red")
    plt.show()

    res = find_neighbouring_countries(world, "Germany")
    print(res)
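A caveat on the area computation in T19: Web Mercator (EPSG:3857) is not an equal-area projection. Linear scale grows roughly as 1/cos(latitude), so areas are inflated by about 1/cos²(latitude). A quick stdlib check of the size of this distortion (no geopandas needed):

```python
import math

def mercator_area_inflation(lat_deg: float) -> float:
    """Approximate factor by which Web Mercator inflates areas at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

print(f"Equator: {mercator_area_inflation(0):.1f}x")       # 1.0x
print(f"Germany (~51N): {mercator_area_inflation(51):.1f}x")
print(f"Greenland (~72N): {mercator_area_inflation(72):.1f}x")
```

For honest area figures, an equal-area CRS (e.g. a Mollweide or Albers projection) should be used instead.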
92 src/energy_data_project/T20_Geopandas_joining.py (new file)
@@ -0,0 +1,92 @@
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd

world = gpd.read_file("../../data/110m_cultural.zip", layer="ne_110m_admin_0_countries")
cities = gpd.read_file("../../data/110m_cultural.zip",
                       layer="ne_110m_populated_places")


def find_neighbouring_countries(df, country_name):
    target = df[df["NAME"] == country_name]

    if len(target) == 0:
        print(f"{country_name} not found")
        return gpd.GeoDataFrame()

    # sjoin: spatial join
    neighbours = gpd.sjoin(target, df, how="inner", predicate="touches")
    return neighbours[["NAME_right", "CONTINENT_right", "geometry"]]


def get_close_cities(df, city_name):
    target = df[df["NAMEASCII"] == city_name]

    if len(target) == 0:
        print(f"{city_name} not found")
        return gpd.GeoDataFrame()

    # sjoin: spatial join
    # combined = gpd.sjoin(target, df, how="inner")
    # combined["distance"] = combined.distance(target.iloc[0])
    neighbours = gpd.sjoin_nearest(target, df, max_distance=10e12)
    return neighbours


# spatial predicates: intersects, contains, within, touches, crosses, overlaps
german_neighbours = find_neighbouring_countries(world, "Germany")
print(german_neighbours[["NAME_right", "CONTINENT_right"]])
print(len(german_neighbours))
print("-"*100)
w2 = world.to_crs("EPSG:3857")
c2 = cities.to_crs("EPSG:3857")
print(list(c2.columns))
close_cities = get_close_cities(c2, "Paris")
print(len(close_cities))
print(close_cities)

w2.plot(column="CONTINENT", legend=True)
plt.show()


def get_cities_in_country(city_df, world_df, country):
    country_match = world_df[world_df["NAME"] == country]

    if len(country_match) == 0:
        print(f"Returning empty for {country}")
        return gpd.GeoDataFrame()

    cities = gpd.sjoin(country_match, city_df,
                       how="inner",
                       predicate="contains")
    cities = cities[["NAME_left", "NAMEASCII"]]
    return cities


def get_cities_in_countries(city_df, world_df, countries):
    res = get_cities_in_country(city_df, world_df, countries[0])
    for country in countries[1:]:
        tmp = get_cities_in_country(city_df, world_df, country)
        res = pd.concat((res, tmp))
    return res


# 1) get_cities_in_country (sjoin)  # NAME_right, NAME_left
# 2) get_cities_in_countries (concat)
us_cities = get_cities_in_country(cities, world, "United States of America")
print(us_cities)
swiss_cities = get_cities_in_country(cities, world, "Switzerland")
print(swiss_cities)

central_european_cities = get_cities_in_countries(
    cities, world,
    ["France", "Switzerland", "Narnia", "Germany", "Denmark"]
)
print(central_european_cities)

# United States = 9
# France = 4
# Italy: 3
# Germany: 1
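The concat pattern in `get_cities_in_countries` can be seen with plain pandas, independent of the map data. A toy sketch (city names and the `cities_for` helper are made up stand-ins for the geopandas version):

```python
import pandas as pd

CITY_DATA = {
    "France": ["Paris", "Lyon"],
    "Germany": ["Berlin"],
}

def cities_for(country):
    # stand-in for get_cities_in_country: one small frame per country
    names = CITY_DATA.get(country, [])
    return pd.DataFrame({"country": [country] * len(names), "city": names})

# unknown countries ("Narnia") contribute zero rows, just like the empty GeoDataFrame above
frames = [cities_for(c) for c in ["France", "Narnia", "Germany"]]
combined = pd.concat(frames, ignore_index=True)
print(len(combined))  # 3
```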
49 src/energy_data_project/T21_OwnPoints.py (new file)
@@ -0,0 +1,49 @@
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
from shapely.geometry import Point

capitals = {
    'Paris': Point(2.3522, 48.8566),
    'Berlin': Point(13.4050, 52.5200),
    'Madrid': Point(-3.7038, 40.4168),
    'Rome': Point(12.4964, 41.9028),
    'London': Point(-0.1278, 51.5074)
}
capitals_gdf = gpd.GeoDataFrame(
    [{'city': city, 'geometry': point} for city, point in capitals.items()],
    crs='EPSG:4326'
)
capitals_proj = capitals_gdf.to_crs('EPSG:3857')  # Web Mercator
print([{'city': city, 'geometry': point} for city, point in capitals.items()])
# compute all pairwise distances between the cities
print(capitals_proj)

print("Distances between cities")
for i, city1 in capitals_proj.iterrows():
    for j, city2 in capitals_proj.iterrows():
        if i >= j:
            continue
        dist_m = city1["geometry"].distance(city2["geometry"])
        dist_km = dist_m / 1_000
        print(f"Distance between {city1['city']} and {city2['city']} = {dist_km:.0f}km")

print()

# buffer
world = gpd.read_file(
    "../../data/110m_cultural.zip",
    layer="ne_110m_admin_0_countries")

switzerland = world[world['NAME'] == 'Switzerland']
if len(switzerland) > 0:
    # project to a metric CRS for buffering (Web Mercator; not equal-area, so distances are approximate)
    switzerland_proj = switzerland.to_crs('EPSG:3857')
    buffer_200km = switzerland_proj.geometry.buffer(200_000)  # 200 km in metres

    buffer_gdf = gpd.GeoDataFrame(geometry=buffer_200km.to_crs('EPSG:4326'))

    # find countries that intersect with this buffer
    countries_near_switzerland = gpd.sjoin(world, buffer_gdf, how='inner', predicate='intersects')
    nearby_countries = countries_near_switzerland['NAME'].tolist()
    print(f"Countries within 200km of Switzerland: {nearby_countries}")
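The pairwise distances above are planar distances in Web Mercator and are therefore inflated at European latitudes. For comparison, the great-circle (haversine) distance needs only the stdlib; a sketch using the Paris and Berlin coordinates from the `capitals` dict:

```python
import math

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points (spherical Earth approximation)."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

d = haversine_km(2.3522, 48.8566, 13.4050, 52.5200)
print(f"Paris-Berlin: {d:.0f} km")  # roughly 880 km
```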
6 src/energy_data_project/__init__.py (new file)
@@ -0,0 +1,6 @@
# __init__ runs every time the package is imported

from .T19_geopandas import find_neighbouring_countries
from .T05_Klassen_properties import Circle

from .T08_WorldCities import *
39 tests/test_T08_WorldCities.py (new file)
@@ -0,0 +1,39 @@
import unittest

import pandas as pd
from energy_data_project import get_larger_cities_north_of_city, get_cities_beyond_latitude

# main.py  # python main.py
# pip install -e /path/to/module
# datei1.py
# test_datei1.py

# create a module and move datei1.py into it
# datei1.py
# test_datei1.py

class MyTestCase(unittest.TestCase):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.world_cities = pd.read_excel("data/worldcities.xlsx")

    def setUp(self):
        print("Loading data")

    def tearDown(self):
        print("After test")

    def test_get_larger_cities_north_of_city(self):
        cities_north = get_larger_cities_north_of_city(self.world_cities, "Berlin")
        assert cities_north.iloc[0, 0], f"Error {cities_north.iloc[0, 0]} "

    def test_something(self):
        cities_north = get_larger_cities_north_of_city(self.world_cities, "Berlin")
        assert cities_north.iloc[0, 0] == "Moscow", f"Error {cities_north.iloc[0, 0]} "


if __name__ == '__main__':
    unittest.main()
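Loading the Excel file in `__init__` re-reads it for every test instance; `unittest`'s `setUpClass` hook loads shared fixtures once per class instead. A minimal sketch with a stand-in dataset in place of the Excel file (the class and data here are hypothetical):

```python
import unittest

class CityTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # runs once for the whole class; put expensive loads (e.g. read_excel) here
        cls.cities = {"Berlin": 52.52, "Moscow": 55.76, "Rome": 41.90}

    def test_moscow_is_north_of_berlin(self):
        self.assertGreater(self.cities["Moscow"], self.cities["Berlin"])

result = unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(CityTests))
print(result.wasSuccessful())  # True
```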
14 tests/test_T19_geopandas.py (new file)
@@ -0,0 +1,14 @@
import energy_data_project
import geopandas as gpd

def test_find_neighbouring_countries():
    world = gpd.read_file("data/110m_cultural.zip",
                          layer="ne_110m_admin_0_countries")
    german_neighbours = energy_data_project.find_neighbouring_countries(
        world,
        "Germany")

    # with pytest, plain asserts are enough
    assert german_neighbours == ['France', 'Poland', 'Austria', 'Switzerland', 'Luxembourg', 'Belgium', 'Netherlands', 'Denmark', 'Czechia'], "Error: German neighbours not found"
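Comparing the neighbour list element-by-element makes the test depend on the row order of the dataset; comparing as sets is more robust. A sketch of the idea with plain lists:

```python
expected = ['France', 'Poland', 'Austria', 'Switzerland', 'Luxembourg',
            'Belgium', 'Netherlands', 'Denmark', 'Czechia']
result = ['Austria', 'Belgium', 'Czechia', 'Denmark', 'France',
          'Luxembourg', 'Netherlands', 'Poland', 'Switzerland']  # same countries, different order

assert result != expected            # order-sensitive comparison fails
assert set(result) == set(expected)  # order-insensitive comparison passes
print("ok")
```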