How to Sort by Geo Distance Using Bounding Boxes in Algolia
Under the hood of an app I'm helping build, we utilize Algolia for search results. One of the features we support are to show items outside the user's search radius.
For example, let's say you were looking for an item to buy within 30 miles of you. But it so happens that the perfect item is listed 31 miles from you. In this case, we'll show you a module with items outside your search radius (e.g. 30-50 miles).
We'll define the problem concretely.
Let radius
be the radius from the user to the item.
Let min_radius
and max_radius
be the minimum/maximum radius to search within.
We need to search for items where min_radius < radius < max_radius
.
Limitations of Algolia Geo Filters
As of the time of writing, Algolia has no native way to solve such a problem.
The only way to get geo-sorting is by using their aroundRadius
parameter, but this param doesn't support a min_radius
param.
We also considered using Algolia's insideBoundingBox
parameter, which lets us specify multiple boxes to search within. We could support the area
where min_radius < radius < max_radius
by constructing 4 boxes around the inner bounding box.
However,
aroundLatLng
will be ignored if used along with this parameter.
This means we wouldn't be able to support our "sort by closest items first" use case. This is a non-starter.
We could've worked around this limitation by changing the client. But being that our business is centered around our mobile app, the client code is already deployed across many customers' devices.
We need to be able to make this change server-side to stay nimble.
Using Numeric Filters to Create Bounding Boxes
To support geo-sorting, Algolia expects documents to contain a
_geoloc
property specifying the coordinates.
For example, an item document would contain something like the following:
{
"_geoloc": {
"lat": 47.6062,
"lng": -122.3321
}
}
With this available, we're able to configure the index to allow _geoloc.lat
and _geoloc.lng
to be used as
numeric filters
to create our own filtering logic.
To sort by distance, we can combine aroundLatLng
and aroundRadius
.
The radius we define will surround the bounding boxes
we create using numeric filters on _geoloc.lat
and _geoloc.lng
.
Implementation Details
Let's recall the problem: We want to search for items where the item's radius
is between some min_radius
and max_radius
.
Here's how we can solve it:
- Set
aroundLatLng
to the user's location. - Set
aroundRadius
to"all"
. We need the radius to surround the bounding boxes we create. - Create bounding boxes by filtering on
_geoloc.lat
and_geoloc.lng
.
Let's see what this looks like in practice.
We'll show all items within a bounding box. Let's assume you've done some logic to compute the coordinates of the upper-left and bottom-right points of the bounding box.
index.search('query', {
aroundRadius: 'all',
aroundLatLng: '47.6062, -122.3321',
filters: [
`_geoloc.lat < ${upperLeft.lat}`,
`_geoloc.lat > ${bottomRight.lat}`,
`_geoloc.lng > ${upperLeft.lng}`,
`_geoloc.lng < ${bottomRight.lng}`,
].join(' AND '),
});
This is a start.
Now we want to filter out the items that are contained within an inner bounding box.
In other words, let's call the outer bounding box withinOuterBoundingBox
and the inner bounding box withinInnerBoundingBox
.
We want to search the items that satisfy
withinOuterBoundingBox && !withinInnerBoundingBox;
We already saw above how to search within the outer bounding box.
const withinOuterBoundingBox = [
`_geoloc.lat < ${upperLeft.lat}`,
`_geoloc.lat > ${bottomRight.lat}`,
`_geoloc.lng > ${upperLeft.lng}`,
`_geoloc.lng < ${bottomRight.lng}`,
].join(' AND ');
You might think we could negate a similar filter for the inner bounding box:
NOT (A AND B AND C AND D)
However, Algolia doesn't let us create a filter like this.
You can verify filter syntax with Algolia's filter syntax validator.
We can workaround this limitation by distributing the NOT
using De Morgan's Law.
The following is logically equivalent and valid filter syntax:
NOT A OR NOT B OR NOT C OR NOT D
We wouldn't use the NOT
syntax since we're dealing with numeric filters.
In order to negate an inequality, we need to flip the sign.
<
becomes>=
>
becomes<=
We can now piece together an expression to exclude the inner bounding box.
const notWithinInnerBoundingBox = [
`_geoloc.lat >= ${innerBoxUpperLeft.lat}`,
`_geoloc.lat <= ${innerBoxBottomRight.lat}`,
`_geoloc.lng <= ${innerBoxUpperLeft.lng}`,
`_geoloc.lng >= ${innerBoxBottomRight.lng}`,
].join(' OR ');
Putting this all together, our Algolia query becomes:
index.search('query', {
aroundRadius: 'all',
aroundLatLng: '47.6062, -122.3321',
filters: `${withinOuterBoundingBox} AND ${notWithinInnerBoundingBox}`,
});
And that's how we solved this problem. ๐
Bonus: Performance Considerations of Setting aroundRadius
to "all"
I was concerned that setting aroundRadius
to "all"
would
degrade performance.
We did some performance tests comparing
the results of setting aroundRadius
to "all"
versus the minimum radius that surrounds the outer bounding box.
It didn't make much of a difference.
To keep things simple, you're probably
fine setting aroundRadius
to "all"
when using _geoloc
to filter results.