Using Google Refine to extract a postcode from Google Maps API JSON

Time: 2022-08-22 10:19:44

I'm trying to use Google Refine to extract postcodes from Google Maps API JSON.

I added a new column by fetching URLs:

"http://maps.googleapis.com/maps/api/geocode/json?sensor=false&address=" + escape(value, "url")

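To see what that expression produces outside Refine, here is a minimal JavaScript sketch. The function name buildGeocodeUrl is hypothetical, and encodeURIComponent is only a stand-in for GREL's escape(value, "url"); the two may encode some characters (spaces, for instance) differently.

```javascript
// Hypothetical helper: builds the same request URL as the GREL expression above.
// The endpoint and query parameters are copied from the question as-is.
function buildGeocodeUrl(address) {
    var base = "http://maps.googleapis.com/maps/api/geocode/json?sensor=false&address=";
    // encodeURIComponent is an assumed stand-in for GREL's escape(value, "url").
    return base + encodeURIComponent(address);
}

console.log(buildGeocodeUrl("44 Homer Street, London"));
```

Each cell value in the address column would be passed through a function like this to produce one request URL per row.
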
Then the resulting JSON is as follows:

{ "results" : [ { "address_components" : [ { "long_name" : "44", "short_name" : "44", "types" : [ "street_number" ] }, { "long_name" : "Homer Street", "short_name" : "Homer St", "types" : [ "route" ] }, { "long_name" : "London", "short_name" : "London", "types" : [ "locality", "political" ] }, { "long_name" : "Greater London", "short_name" : "Gt Lon", "types" : [ "administrative_area_level_2", "political" ] }, { "long_name" : "United Kingdom", "short_name" : "GB", "types" : [ "country", "political" ] }, { "long_name" : "W1H 4NW", "short_name" : "W1H 4NW", "types" : [ "postal_code" ] }, { "long_name" : "London", "short_name" : "London", "types" : [ "postal_town" ] } ], "formatted_address" : "44 Homer Street, London, Greater London W1H 4NW, UK", "geometry" : { "location" : { "lat" : 51.51981750, "lng" : -0.16534040 }, "location_type" : "ROOFTOP", "viewport" : { "northeast" : { "lat" : 51.52116648029151, "lng" : -0.1639914197084980 }, "southwest" : { "lat" : 51.51846851970851, "lng" : -0.1666893802915020 } } }, "types" : [ "street_address" ] } ], "status" : "OK" }

After browsing through a few blogs to find the relevant code, I then tried transforming the column using this...

value.parseJson().results[0]["formatted_address"]

...which works great for the full address.

The problem occurs when I try to extract the postcode. I tried fiddling around and got nowhere, then I downloaded JSONPad and pasted the JSON into a tree map to get the path:

value.parseJson().results[0]["address_components"][5]["long_name"]

The problem is that this extracts the postcode perfectly for some entries, and not so perfectly for others, where it extracts something else - town or country, for example.

Changing the [5] to [6] seems to extract the postcodes for the other addresses, but is there a way to extract ONLY the postcode, regardless of where it falls in the structure?
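
The index problem described above can be reproduced in a few lines of JavaScript. This is only an illustrative sketch: the first array is shortened from the response in the question, while the second is an invented variant (the extra "England" component is an assumption made up for the demonstration, not real API output).

```javascript
// Shortened from the response in the question: only long_name and types are kept.
var fromQuestion = [
    { long_name: "44", types: ["street_number"] },
    { long_name: "Homer Street", types: ["route"] },
    { long_name: "London", types: ["locality", "political"] },
    { long_name: "Greater London", types: ["administrative_area_level_2", "political"] },
    { long_name: "United Kingdom", types: ["country", "political"] },
    { long_name: "W1H 4NW", types: ["postal_code"] },
    { long_name: "London", types: ["postal_town"] }
];

// Invented variant for illustration: one extra component shifts everything after it.
var shifted = fromQuestion.slice(0, 4)
    .concat([{ long_name: "England", types: ["administrative_area_level_1", "political"] }])
    .concat(fromQuestion.slice(4));

console.log(fromQuestion[5].long_name); // the postcode
console.log(shifted[5].long_name);      // the country, because the postcode moved to index 6
console.log(shifted[6].long_name);      // the postcode again
```

Because the number of components varies from address to address, no fixed index can be relied on; the "types" field is the only stable marker.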

Any help much appreciated!

2 Answers

#1


What you'd probably have to do is loop over the structs in the address_components array, checking the "types" of each one. When types contains "postal_code", then tada, that's your postcode.

Something like the following code (which worked for me):

<script type="text/javascript">
var stuData = 
{
    "results": [{
            "address_components": [{
                    "long_name": "44",
                    "short_name": "44",
                    "types": ["street_number"]
                }, {
                    "long_name": "Homer Street",
                    "short_name": "Homer St",
                    "types": ["route"]
                }, {
                    "long_name": "London",
                    "short_name": "London",
                    "types": ["locality", "political"]
                }, {
                    "long_name": "Greater London",
                    "short_name": "Gt Lon",
                    "types": ["administrative_area_level_2", "political"]
                }, {
                    "long_name": "United Kingdom",
                    "short_name": "GB",
                    "types": ["country", "political"]
                }, {
                    "long_name": "W1H 4NW",
                    "short_name": "W1H 4NW",
                    "types": ["postal_code"]
                }, {
                    "long_name": "London",
                    "short_name": "London",
                    "types": ["postal_town"]
                }
            ],
            "formatted_address": "44 Homer Street, London, Greater London W1H 4NW, UK",
            "geometry": {
                "location": {
                    "lat": 51.51981750,
                    "lng": -0.16534040
                },
                "location_type": "ROOFTOP",
                "viewport": {
                    "northeast": {
                        "lat": 51.52116648029151,
                        "lng": -0.1639914197084980
                    },
                    "southwest": {
                        "lat": 51.51846851970851,
                        "lng": -0.1666893802915020
                    }
                }
            },
            "types": ["street_address"]
        }
    ],
    "status": "OK"
};

var myPostcode;
var components = stuData.results[0].address_components;

// Scan the components and stop at the first one whose types include "postal_code".
// (A break inside a nested loop would only leave the inner loop.)
for (var i = 0; i < components.length; i++) {
    if (components[i].types.indexOf("postal_code") !== -1) {
        myPostcode = components[i].long_name;
        break;
    }
}

console.log(myPostcode);
</script>
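
If the same lookup has to run over many rows, some of which may have no result at all, it could be wrapped in a function that returns null instead of throwing. This is a rough sketch, not a tested production routine: the name findPostcode is hypothetical, and the assumption that a failed lookup comes back with an empty "results" array is mine rather than something stated in the question.

```javascript
// Hypothetical helper: returns the postcode from a parsed geocode response,
// or null when there is no result or no "postal_code" component.
function findPostcode(response) {
    if (!response || !response.results || response.results.length === 0) {
        return null; // assumed shape of a failed lookup: empty results array
    }
    var components = response.results[0].address_components || [];
    for (var i = 0; i < components.length; i++) {
        var types = components[i].types || [];
        if (types.indexOf("postal_code") !== -1) {
            return components[i].long_name;
        }
    }
    return null;
}

// Trimmed sample based on the response in the question.
var sample = { results: [{ address_components: [
    { long_name: "United Kingdom", short_name: "GB", types: ["country", "political"] },
    { long_name: "W1H 4NW", short_name: "W1H 4NW", types: ["postal_code"] }
] }], status: "OK" };

console.log(findPostcode(sample));
console.log(findPostcode({ results: [], status: "ZERO_RESULTS" }));
```

Returning null keeps a single bad row from aborting the whole run; the caller can then decide whether to leave the cell blank or flag it.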

#2


You could translate duncan's answer from JavaScript to Python and use it with the Jython interpreter, or you could use the following GREL expression (which I posted to Google Groups in response to your query):

filter(forEach(value.parseJson().results[0].address_components,c,if(c.types[0]=='postal_code',c.long_name,'')),v,v!='')[0]
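
For readers more familiar with JavaScript than GREL, the same map-then-filter idea might look like the sketch below. The function name postcodeViaFilter is hypothetical. One deliberate difference: the GREL expression only inspects c.types[0], whereas this version checks every entry in "types", on the assumption that "postal_code" is not guaranteed to come first.

```javascript
// Hypothetical helper mirroring the GREL expression: map each component to its
// long_name if it is a postcode (else ""), drop the empty strings, take the first.
function postcodeViaFilter(jsonText) {
    var components = JSON.parse(jsonText).results[0].address_components;
    var names = components.map(function (c) {
        // Unlike the GREL version, this checks all of c.types, not just c.types[0].
        return c.types.indexOf("postal_code") !== -1 ? c.long_name : "";
    });
    return names.filter(function (v) { return v !== ""; })[0];
}

// Trimmed sample based on the response in the question, as raw JSON text.
var cellValue = JSON.stringify({ results: [{ address_components: [
    { long_name: "London", types: ["locality", "political"] },
    { long_name: "W1H 4NW", types: ["postal_code"] }
] }], status: "OK" });

console.log(postcodeViaFilter(cellValue));
```

When no component is a postcode, the filtered array is empty and the function returns undefined.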

Of course, this all presumes that you're using this data for display on a Google Map, per the API's Terms of Service.
