A Coding Implementation of Spatial Graph Neural Networks for Urban Function Inference with city2graph, OSMnx, and PyTorch Geometric

Key Takeaways

In this tutorial, we build an end-to-end spatial graph learning pipeline using city2graph. We start by collecting real urban POI data and street network information from OpenStreetMap, with a synthetic fallback so the workflow still runs when the download fails. We then engineer spatial features, construct multiple proximity graph families, and compare how different graph-building strategies represent the same urban environment. After that, we create both heterogeneous and homogeneous graph structures, convert them into PyTorch Geometric format, and train a GraphSAGE model to predict POI categories from spatial structure. Through this process, we integrate geospatial data processing, graph construction, and GNN-based urban function inference into a single practical workflow.

Installing city2graph and Importing Geospatial and Graph Learning Libraries

```python
!pip -q install "city2graph[cpu]" osmnx contextily scikit-learn 2>/dev/null

import warnings, numpy as np, pandas as pd, geopandas as gpd
warnings.filterwarnings("ignore")
from shapely.geometry import Point
import matplotlib.pyplot as plt

import city2graph as c2g
print("city2graph version:", getattr(c2g, "__version__", "unknown"))
print("PyTorch / PyG available:", c2g.is_torch_available())

import torch
import torch.nn.functional as F
from torch_geometric.nn import SAGEConv, to_hetero
from torch_geometric.utils import to_undirected
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import NearestNeighbors
from sklearn.metrics import accuracy_score, f1_score
from sklearn.decomposition import PCA

# Fix the seeds for NumPy and PyTorch.
SEED = 42
np.random.seed(SEED); torch.manual_seed(SEED)
```

We begin by installing the required libraries and importing the geospatial, graph learning, and machine learning tools used throughout the tutorial. We verify that city2graph and PyTorch Geometric are available so the rest of the workflow can run properly.
We also set a fixed random seed to make the graph construction, training split, and model results more reproducible.

Collecting OpenStreetMap POI Data with a Synthetic Fallback

```python
CENTER = (35.6595, 139.7005)  # (lat, lon)
DIST_M = 1100                 # query radius in metres
TAG_QUERIES = {
    "food": {"amenity": ["restaurant", "cafe", "fast_food", "bar", "pub"]},
    "retail": {"shop": True},
    "education": {"amenity": ["school", "university", "college", "kindergarten", "library"]},
    "health": {"amenity": ["hospital", "clinic", "pharmacy", "doctors", "dentist"]},
}

def to_points(gdf):
    # Collapse polygons and lines to a single point guaranteed to lie on the geometry.
    g = gdf.copy()
    g["geometry"] = g.geometry.representative_point()
    return g

poi_gdf, segments_gdf = None, None
try:
    import osmnx as ox
    ox.settings.use_cache = True
    ox.settings.log_console = False
    frames = []
    for label, tags in TAG_QUERIES.items():
        try:
            f = ox.features_from_point(CENTER, tags=tags, dist=DIST_M)
            f = f[f.geometry.notna()]
            if len(f):
                f = to_points(f)[["geometry"]].copy()
                f["category"] = label
                frames.append(f)
        except Exception as e:
            print(f"  (skip {label}: {e})")
    if not frames:
        raise RuntimeError("No POIs returned from Overpass.")
    poi_gdf = gpd.GeoDataFrame(pd.concat(frames, ignore_index=True), crs="EPSG:4326")
    G = ox.graph_from_point(CENTER, dist=DIST_M, network_type="walk")
    segments_gdf = ox.graph_to_gdfs(G, nodes=False, edges=True).reset_index(drop=True)[["geometry"]]
    print(f"OSM acquisition OK -> {len(poi_gdf)} POIs, {len(segments_gdf)} street segments")
except Exception as e:
    # Fallback: 8 Gaussian clusters, each dominated (75%) by one category.
    print(f"OSM unavailable ({e}) -> generating synthetic clustered POIs.")
    rng = np.random.default_rng(SEED)
    cats = list(TAG_QUERIES.keys())
    # CENTER is (lat, lon); reverse it so points are generated as (lon, lat).
    centers = rng.uniform(-0.01, 0.01, size=(8, 2)) + np.array(CENTER[::-1])
    rows = []
    for ci, c in enumerate(centers):
        dom = cats[ci % len(cats)]
        n = rng.integers(40, 90)
        pts = c + rng.normal(0, 0.0016, size=(n, 2))
        for (lon, lat) in pts:
            cat = dom if rng.random() < 0.75 else rng.choice(cats)
            rows.append({"geometry": Point(lon, lat), "category": cat})
    poi_gdf = gpd.GeoDataFrame(rows, crs="EPSG:4326")
    segments_gdf = None
    print(f"Synthetic dataset -> {len(poi_gdf)} POIs")

# Cap the dataset at 700 POIs.
if len(poi_gdf) > 700:
    poi_gdf = poi_gdf.sample(700, random_state=SEED).reset_index(drop=True)

# Reproject to a metric (UTM) CRS so distances are in metres.
metric_crs = poi_gdf.estimate_utm_crs()
poi_gdf = poi_gdf.to_crs(metric_crs).reset_index(drop=True)
if segments_gdf is not None:
    segments_gdf = segments_gdf.to_crs(metric_crs)
print("Class balance:\n", poi_gdf["category"].value_counts())
```

We collect real POI data from OpenStreetMap around Shibuya, Tokyo, and group the locations into four broad urban function categories: food, retail, education, and health. We also download the walkable street network so that the POIs can later be connected with urban-form features. If the OSM request fails, we generate a synthetic clustered dataset, which keeps the tutorial runnable even when online data access is unavailable.

Engineering Spatial Features and Building Proximity Graph Families

```python
poi_gdf["cx"] = poi_gdf.geometry.x
poi_gdf["cy"] = poi_gdf.geometry.y
coords = poi_gdf[["cx", "cy"]].to_numpy()

# Local density: number of other POIs within 150 m (subtract 1 for the point itself).
nn = NearestNeighbors(radius=150.0).fit(coords)
poi_gdf["local_density"] = [len(idx) - 1 for idx in nn.radius_neighbors(coords, return_distance=False)]

# Distance to the nearest street segment; 0.0 when no street data is available.
if segments_gdf is not None and len(segments_gdf):
    try:
        joined = gpd.sjoin_nearest(poi_gdf[["geometry"]], segments_gdf[["geometry"]], distance_col="dist_street")
        poi_gdf["dist_street"] = joined.groupby(level=0)["dist_street"].min().reindex(poi_gdf.index).fillna(0.0)
    except Exception:
        poi_gdf["dist_street"] = 0.0
else:
    poi_gdf["dist_street"] = 0.0

poi_gdf["category"] = poi_gdf["category"].astype("category")
poi_gdf["label"] = poi_gdf["category"].cat.codes.astype(int)
CLASS_NAMES = list(poi_gdf["category"].cat.categories)
print("Classes:", CLASS_NAMES)

def graph_stats(name, builder):
    # Build one graph and report its edge count and mean node degree.
    try:
        nodes, edges = builder()
        deg = pd.Series(np.r_[edges.index.get_level_values(0), edges.index.get_level_values(1)]).value_counts()
        return name, len(edges), round(deg.mean(), 2), (nodes, edges)
    except Exception as e:
        return name, f"ERR: {e}", None, None

builders = {
    "KNN (k=8)": lambda: c2g.knn_graph(poi_gdf, distance_metric="euclidean", k=8, as_nx=False),
    "Delaunay": lambda: c2g.delaunay_graph(poi_gdf, as_nx=False),
    "Gabriel": lambda: c2g.gabriel_graph(poi_gdf, as_nx=False),
    "RNG": lambda: c2g.relative_neighborhood_graph(poi_gdf, as_nx=False),
    "EMST": lambda: c2g.euclidean_minimum_spanning_tree(poi_gdf, as_nx=False),
    "Waxman": lambda: c2g.waxman_graph(poi_gdf, distance_metric="euclidean", r0=150, beta=0.6),
}

print("\n--- Proximity graph comparison ---")
print(f"{'graph':<14}{'#edges':>10}{'avg_degree':>12}")
built = {}
for nm, b in builders.items():
    name, ne, avgdeg, payload = graph_stats(nm, b)
    print(f"{name:<14}{str(ne):>10}{str(avgdeg):>12}")
    if payload:
        built[nm] = payload

# Plot three of the topologies side by side over the same POIs.
fig, axes = plt.subplots(1, 3, figsize=(16, 5))
for ax, key in zip(axes, ["KNN (k=8)", "Delaunay", "EMST"]):
    if key in built:
        n_, e_ = built[key]
        e_.plot(ax=ax, linewidth=0.4, color="#3b7dd8", alpha=0.6)
    poi_gdf.plot(ax=ax, markersize=4, color="#d83b5c")
    ax.set_title(key); ax.set_axis_off()
plt.suptitle("Spatial graph topologies on the same POI set", y=1.02)
plt.tight_layout(); plt.show()
```

We engineer spatial features for each POI by extracting its projected coordinates, counting neighboring POIs within 150 m as a local density measure, and computing the distance to the nearest street segment. We then assign category labels and build several families of proximity graphs: KNN, Delaunay, Gabriel, RNG, EMST, and Waxman. We compare their edge counts and average degrees, then visualize selected graph topologies to see how differently they connect the same set of POIs.
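The edge-count and average-degree comparison can be illustrated without any geospatial dependencies. The following is a minimal sketch, assuming a brute-force undirected k-nearest-neighbor graph over a handful of made-up planar points; the helper names `knn_edges` and `average_degree` are hypothetical and are not part of city2graph.

```python
import math
from collections import Counter

def knn_edges(points, k):
    """Return undirected kNN edges as a set of (i, j) pairs with i < j."""
    edges = set()
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p and keep the k closest.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

def average_degree(edges):
    """Mean degree over the nodes that appear in at least one edge."""
    deg = Counter()
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return sum(deg.values()) / len(deg) if deg else 0.0

# Hypothetical projected coordinates in metres: two tight clusters of three points.
points = [(0, 0), (10, 0), (0, 10), (500, 500), (510, 500), (500, 510)]
edges = knn_edges(points, k=2)
print(len(edges), average_degree(edges))  # 6 2.0
```

With k=2, each cluster closes into a triangle and no edge crosses between clusters, so the graph has 6 edges and every node has degree 2. Counting degree from both endpoints of each edge is the same idea the tutorial's `graph_stats` helper applies to the edge index.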
Constructing Heterogeneous and Homogeneous Graphs in PyTorch Geometric

```python
nodes_dict = {}
for cat in CLASS_NAMES:
    sub = poi_gdf[poi_gdf["category"] == cat].copy().reset_index(drop=True)
    nodes_dict[cat] = sub[["geometry", "cx", "cy", "local_density"]]

try:
    _, bridge_edges = c2g.bridge_nodes(nodes_dict, proximity_method="knn", k=3, distance_metric="euclidean")
    hetero = c2g.gdf_to_pyg(
        nodes_dict, br
```
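To make the idea of a typed, heterogeneous graph concrete, here is a small dependency-free sketch. It assumes that bridging means linking each node of one category to its k nearest nodes in every other category, and it stores edges under (source type, relation, destination type) triples, the keying convention PyTorch Geometric uses for heterogeneous edge types. The function `bridge_knn`, the relation label `"is_nearby"`, and the coordinates are all hypothetical; this is an illustration, not city2graph's implementation.

```python
import math

def bridge_knn(nodes_by_type, k):
    """Link each node to its k nearest nodes of every other type.

    Returns {(src_type, "is_nearby", dst_type): [(src_idx, dst_idx), ...]},
    where indices are local to each node type.
    """
    edges = {}
    for src_type, src_pts in nodes_by_type.items():
        for dst_type, dst_pts in nodes_by_type.items():
            if src_type == dst_type:
                continue  # bridging only connects different types
            pairs = []
            for i, p in enumerate(src_pts):
                # Rank destination nodes by Euclidean distance to the source node.
                nearest = sorted(
                    range(len(dst_pts)),
                    key=lambda j: math.dist(p, dst_pts[j]),
                )
                pairs.extend((i, j) for j in nearest[:k])
            edges[(src_type, "is_nearby", dst_type)] = pairs
    return edges

# Hypothetical projected coordinates in metres for two POI categories.
nodes_by_type = {
    "food": [(0.0, 0.0), (100.0, 0.0)],
    "retail": [(5.0, 0.0), (90.0, 0.0), (300.0, 0.0)],
}
edges = bridge_knn(nodes_by_type, k=1)
for edge_type, pairs in edges.items():
    print(edge_type, pairs)
```

Because each node type keeps its own index space, the edge lists are directed and asymmetric: the isolated retail point at x=300 links to a food node, but no food node links back to it.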