What we learned from hand training A.I. on 25k+ property descriptions

Hardcore artificial intelligence technology companies are using hand-trained datasets to enhance their artificial intelligence offerings. This is a tedious and sweaty process that we’ve been tackling for real estate at Structurely, and our A.I. has learned a lot already. It was not an easy task: as we all know, reading real estate property descriptions can be pretty gruesome. Imagine trying to train A.I. on a sentence like,

“You’ll love the W/O bsmt w/ a LL family room w/frplc and a BR w/en-suite double vanity BA.”

Given a tremendous number of hand-tagged positive and negative examples, powerful natural language processing models, such as deep neural networks, can be trained to understand that this sentence is talking about a “finished basement”. Our data science team has been doing this for months now and has acquired more than 25k annotated examples. With this data they’ve built various models that are able to automatically understand every detail of a property description with extreme accuracy. We’ve learned that almost no two sentences are ever the same and that Realtors rarely describe features and properties in the same way.
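To make the problem concrete, here is a minimal sketch of the simplest possible first step: expanding listing shorthand with a lookup table. This is a rule-based stand-in, not the trained models described above, and the `ABBREVIATIONS` table and function names are hypothetical, made up for illustration.

```python
# Hypothetical shorthand table; a trained model would learn these
# meanings from annotated examples instead of a fixed dictionary.
ABBREVIATIONS = {
    "w/o": "walkout",
    "bsmt": "basement",
    "w/": "with",
    "ll": "lower level",
    "frplc": "fireplace",
    "br": "bedroom",
    "ba": "bathroom",
}


def expand_token(token: str) -> str:
    """Expand one whitespace-delimited token, keeping trailing punctuation."""
    core = token.rstrip(".,;:!?")
    tail = token[len(core):]
    key = core.lower()
    if key in ABBREVIATIONS:
        return ABBREVIATIONS[key] + tail
    if key.startswith("w/") and len(key) > 2:
        # Glued forms such as "w/frplc": expand the part after "w/".
        return "with " + expand_token(core[2:]) + tail
    return token


def expand(description: str) -> str:
    """Expand every known abbreviation in a property description."""
    return " ".join(expand_token(t) for t in description.split())


sentence = ("You'll love the W/O bsmt w/ a LL family room w/frplc "
            "and a BR w/en-suite double vanity BA.")
print(expand(sentence))
```

A table like this breaks down quickly, since the same shorthand can mean different things in different listings, which is part of why hand-annotated training data matters.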

Exposing our hand trained models to other tech vendors

So what’s the point of all this tremendous effort and, more importantly, how does it actually impact real estate? Well, imagine if your IDX property search were powered by natural language instead of simple searches on number of beds, baths, etc. As you know from the recent success of companies like RealScout, your customers will love you.

By exposing these powerful models to IDX search vendors and other real estate technology companies, everyone can begin to benefit from our tedious hand-tagging of obscurely described listings. You can now automatically build searches that recommend properties that have “main level laundry rooms” and “updated flooring”.
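As a rough sketch of what a vendor could do once listings carry clean feature tags, the snippet below filters listings by required features alongside an ordinary bed count. The `Listing` class, the `search` function and the sample addresses are all hypothetical; this is not an actual Structurely or IDX API.

```python
from dataclasses import dataclass, field


@dataclass
class Listing:
    address: str
    beds: int
    # Lowercase feature tags, assumed to come from an upstream tagging model.
    features: set = field(default_factory=set)


def search(listings, required_features, min_beds=0):
    """Return listings that have every required feature and enough beds."""
    wanted = {feature.lower() for feature in required_features}
    return [
        item for item in listings
        if item.beds >= min_beds and wanted <= item.features
    ]


# Made-up sample data for illustration only.
listings = [
    Listing("12 Oak Ave", 3, {"main level laundry room", "updated flooring"}),
    Listing("98 Elm St", 2, {"updated flooring"}),
    Listing("5 Pine Ct", 4, {"main level laundry room", "updated flooring", "new roof"}),
]

matches = search(listings, ["Main Level Laundry Room", "updated flooring"], min_beds=3)
print([item.address for item in matches])
```

The set comparison `wanted <= item.features` is a subset check, so a listing matches only when it has all of the requested features.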

But this goes beyond just searching for property. With this enhanced and clean listing data, comparables, valuations and other market stats become more and more efficient. We can further our machine learning efforts by feeding models this powerful data and start to see new correlations between listings, such as associating a “new roof” with a $10k +/- change in the property valuation. This helps glean new insights into markets and in turn makes the relationship between agents and consumers more transparent. We’ve come a long way since MLS books.
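The simplest version of such a correlation is a difference in average price between listings with and without a tag. The sketch below shows only that arithmetic. The `SALES` records are made-up toy numbers, not market data, and `naive_price_gap` is a hypothetical helper name.

```python
from statistics import mean

# Toy, made-up records of (feature tags, sale price); not real market data.
SALES = [
    ({"new roof", "updated kitchen"}, 260_000),
    ({"new roof"}, 310_000),
    ({"updated kitchen"}, 250_000),
    (set(), 300_000),
]


def naive_price_gap(sales, tag):
    """Mean price of listings with `tag` minus mean price of those without."""
    with_tag = [price for tags, price in sales if tag in tags]
    without_tag = [price for tags, price in sales if tag not in tags]
    if not with_tag or not without_tag:
        raise ValueError(f"need listings both with and without {tag!r}")
    return mean(with_tag) - mean(without_tag)


gap = naive_price_gap(SALES, "new roof")
print(f"new roof: {gap:+,}")
```

A raw gap like this ignores size, location and every other difference between the homes, so a real valuation model would need to control for those, for example with a regression over many tagged listings.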

Amazon, Google and Facebook are all investing in natural language

Amazon, Google and Facebook have each recently bought a natural language company. Every one of these acquired companies does one thing, and does it very well: enable developers to build natural language apps. It’s pretty easy to see why these behemoths bought them up: consumers want to start searching using natural language.

By strategically buying NLP companies, these big players are leveraging crowdsourcing to continually annotate data and build up the ability of their now in-house technology to read and write natural language. Just take a look at Google’s newest app, Allo, or Facebook’s exploding chatbot ecosystem.

“Conversational User Interfaces” are becoming the new way we interact with apps. Some people may argue that ‘chatbots’ will entirely replace websites or apps, but that’s never going to be the case. Instead, a better generation of apps will soon hit the market, ones that are powered by natural language.

Natural language isn’t only consumer facing

Home buyers aren’t the only ones who can benefit from searching for,

“a 2 bed home in San Antonio with an updated roof and kitchen”

Real estate agents and brokers can also leverage this technology. When trained properly using highly annotated and complete data, CRMs can start to pick up on patterns in messages and help score and predict if and when a lead will close based on what they are saying.

“I’d like to put an offer in on 123 Main St. this week”

This sentence is a pretty telling sign that a lead is close to closing. Now imagine if a CRM were continuously monitoring your emails and messages to help you segment and tag leads automatically, making agents and brokers more efficient.
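To illustrate the idea, here is a minimal keyword-based sketch of scoring messages and tagging a lead. It is a deliberately crude stand-in for a trained model: the patterns, weights, threshold and tag names in `SIGNALS` and `tag_lead` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical intent signals and weights; a trained model would learn
# these from annotated conversations rather than hand-written patterns.
SIGNALS = [
    (re.compile(r"\b(put|make|submit)\b.*\boffer\b", re.I), 5),
    (re.compile(r"\bpre-?approv(ed|al)\b", re.I), 3),
    (re.compile(r"\b(showing|tour)\b", re.I), 2),
    (re.compile(r"\bjust (looking|browsing)\b", re.I), -2),
]


def score_message(text: str) -> int:
    """Sum the weights of every signal that appears in one message."""
    return sum(weight for pattern, weight in SIGNALS if pattern.search(text))


def tag_lead(messages, hot_threshold=5) -> str:
    """Tag a lead as hot, warm or cold from the total score of its messages."""
    total = sum(score_message(message) for message in messages)
    if total >= hot_threshold:
        return "hot"
    return "warm" if total > 0 else "cold"


message = "I'd like to put an offer in on 123 Main St. this week"
print(score_message(message), tag_lead([message]))
```

Hand-written patterns like these miss most real phrasing, which is where annotated training data earns its keep.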

Or if we look at the earlier example of powering an IDX search using natural language, imagine sending highly personalized emails automatically written using natural language. Instead of sending horrendous drip campaign emails that bombard your prospects with incorrect information, simple emails with highly specific content are more effective:

“I saw you liked 123 Main St. with an updated kitchen and tile backsplash. Here are a couple more properties in San Antonio with updated kitchens.”

The real estate industry desperately needs to look around at the growing natural language technology landscape. When companies like Facebook, Amazon and Google are making heavy investments in this space, it’s apparent that natural language will start to change how we interact with technology to buy and sell homes. Conversational interfaces are coming to real estate, but they’re not going to come easily, thanks in no small part to the subpar data we rely on to search for homes. But we are dedicated to cleaning and learning from this data and exposing it to technology partners like you 🙂

About Nathan Joens

Nate is the co-founder of Structurely, which builds artificial intelligence for real estate to help personalize the interactions buyers and sellers have through messaging. Nate works with a team of 3 data scientists specializing in artificial intelligence technology who love to solve the tough problems they face in real estate data.

  • Sounds exciting Nathan. I signed up on your site to be alerted when you have something we can put on our website.

  • I wonder — maybe the future of search is predicting the homes they want — without the buyer “searching” for anything?

    • Nate Joens

      This is a definite possibility and probably doable even right now. So long as a user authenticates a search site to view their Facebook posts, likes and profile, there are a good deal of insights that could be passed into a search to automatically generate a set of predicted properties that user may like. It’s probably kind of a stretch to associate a user’s FB behavior with their interest in a property, but it’d be interesting to see how the results play out.

    • We currently track every property a user views. Then we have a page that gives us a summary so we can focus in on what they are most interested in. For example, someone might view a luxury listing because they are just curious and like the photos, but we see from their history most of their views are Condos in Waikiki in a certain price range.

      The summary is the key to making sense of the data. We drop listings they only viewed once, like that luxury listing, and focus where most of their views are.

      We then use this data to set them up on a Watch List so they keep getting properties that interest them.

      Of course, we have to see what they search before we can create this list, so that is not really as advanced as what you are thinking.

      We do have Facebook data that could be tied into search, because we advertise our listings, and people who like or comment positively probably would like to see more homes like that.

  • Nathan – I really enjoyed that article. Exciting possibilities ahead to make up for all the long days at the keyboard. I’ll be following Structurely keenly. 🙂
