US-12620040-B2 - Valuation of homes using geographic regions of varying granularity


Abstract

A facility for estimating a subject home's value is described. For each of one or more direct home attributes, the facility determines the value of the direct home attribute for the subject home. For each of a plurality of arbitrary geographic regions of different sizes containing the subject home, the facility determines information relating to the geographic region as a whole. The facility then subjects the determined values and information to a statistical home valuation model to obtain an estimated value of the subject home.

Inventors

Nima Shahbazi
Mohamed Chahhou
Jordan Meyer
Shize Su

Assignees

MFTB Holdco, Inc.

Dates

Publication Date: 20260505
Application Date: 20231228

Claims (20)

1. A method in a computing system for using one or more machine learning models to estimate a value of a subject home, comprising: automatically generating, for each region corresponding to each of a plurality of region sizes and based on a latitude/longitude pair associated with the subject home, a single computer-readable geohash encoding value, wherein the single computer-readable geohash encoding value of a first region size is generated by discarding a least significant digit of the geohash encoding value of a second region size, the first region size being at a next-less-granular level than the second region size; creating a training set comprising data of sale transactions of a plurality of homes selected from homes in two or more region sizes; periodically training the one or more machine learning models using the created training set, wherein the one or more machine learning models comprises at least one of: a gradient boosting machine, a support vector machine, or a neural network; determining, based on the single computer-readable geohash encoding value, values of one or more independent variables of the one or more machine learning models associated with the subject home; and applying the one or more machine learning models to the values of the one or more independent variables to produce an estimate of the value of the subject home, and wherein, for each of the plurality of region sizes, independent variables of the one or more machine learning models comprise: an independent variable identifying a region of a region size containing the subject home, one or more independent variables each identifying a neighboring region of the region size that borders the region of the region size containing the subject home, wherein, for a respective region size, the region containing the subject home is non-overlapping with the neighboring region, and one or more independent variables each determined from an aggregation of values of a home attribute across all homes, in the region of the region size containing the subject home, for which a value of the home attribute is available.
2. The method of claim 1, further comprising: measuring a difference between the estimate of the value of the subject home and a selling price of the subject home.
3. The method of claim 1, wherein the two or more region sizes are of different sizes.
4. The method of claim 1, further comprising, for each region, defining the region to include an area specified for the region size of the region.
5. The method of claim 1, further comprising, for each region, defining the region to include an area within a percentage of a target area specified for the region size of the region.
6. The method of claim 1, further comprising, for each region, defining the region to include a number of homes specified for the region size of the region.
7. The method of claim 1, further comprising, for each region, defining the region to include a percentage of a number of homes specified for the region size of the region.
8. The method of claim 1, wherein the aggregation of the values of the home attribute comprises computing at least one of: a median, a mean, or a variance of the values of the home attribute across the homes in the region.
9. A computer-readable non-transitory medium having instructions that cause a computer to perform a method for using one or more machine learning models to estimate a value of a subject home, the method comprising: automatically generating, for each region corresponding to each of a plurality of region sizes and based on a latitude/longitude pair associated with the subject home, a single computer-readable geohash encoding value, wherein the single computer-readable geohash encoding value of a first region size is generated by discarding a least significant digit of the single computer-readable geohash encoding value of a second region size, the first region size being at a next-less-granular level than the second region size; creating a training set comprising data of sale transactions of a plurality of homes selected from homes in two or more region sizes; periodically training the one or more machine learning models using the created training set, wherein the one or more machine learning models comprises at least one of: a gradient boosting machine, a support vector machine, or a neural network; determining, based on the single computer-readable geohash encoding value, values of one or more independent variables of the one or more machine learning models associated with the subject home; and applying the one or more machine learning models to the values of the one or more independent variables to produce an estimate of the value of the subject home, and wherein, for each of the plurality of region sizes, independent variables of the one or more machine learning models comprise: an independent variable identifying a region of a region size containing the subject home, one or more independent variables each identifying a neighboring region of the region size that borders the region of the region size containing the subject home, wherein, for a respective region size, the region containing the subject home is non-overlapping with the neighboring region, and one or more independent variables each determined from an aggregation of values of a home attribute across all homes, in the region of the region size containing the subject home, for which a value of the home attribute is available.
10. The computer-readable non-transitory medium of claim 9, wherein the method further comprises: measuring a difference between the estimate of the value of the subject home and a selling price of the subject home.
11. The computer-readable non-transitory medium of claim 9, wherein the two or more region sizes are of different sizes.
12. The computer-readable non-transitory medium of claim 9, further comprising, for each region, defining the region to include (a) an area specified for the region size of the region, (b) an area within a percentage of a target area specified for the region size of the region, (c) a number of homes specified for the region size of the region, or (d) a percentage of the number of homes specified for the region size of the region.
13. The computer-readable non-transitory medium of claim 9, wherein the aggregation of the values of the home attribute comprises computing at least one of: a median, a mean, or a variance of the values of the home attribute across the homes in the region.
14. An apparatus comprising a processor and a memory, the processor being configured to implement a method for using one or more machine learning models to estimate a value of a subject home, the method comprising: automatically generating, for each region corresponding to each of a plurality of region sizes and based on a latitude/longitude pair associated with the subject home, a single computer-readable geohash encoding value, wherein the single computer-readable geohash encoding value of a first region size is generated by discarding a least significant digit of the single computer-readable geohash encoding value of a second region size, the first region size being at a next-less-granular level than the second region size; creating a training set comprising data of sale transactions of a plurality of homes selected from homes in two or more region sizes; periodically training the one or more machine learning models using the created training set, wherein the one or more machine learning models comprises at least one of: a gradient boosting machine, a support vector machine, or a neural network; determining, based on the single computer-readable geohash encoding value, values of one or more independent variables of the one or more machine learning models associated with the subject home; and applying the one or more machine learning models to the values of the one or more independent variables to produce an estimate of the value of the subject home, and wherein, for each of the plurality of region sizes, independent variables of the one or more machine learning models comprise: an independent variable identifying a region of a region size containing the subject home, one or more independent variables each identifying a neighboring region of the region size that borders the region of the region size containing the subject home, wherein, for a respective region size, the region containing the subject home is non-overlapping with the neighboring region, and one or more independent variables each determined from an aggregation of values of a home attribute across all homes, in the region of the region size containing the subject home, for which a value of the home attribute is available.
15. The apparatus of claim 14, further comprising: measuring a difference between the estimate of the value of the subject home and a selling price of the subject home.
16. The apparatus of claim 14, wherein the two or more region sizes are of different sizes.
17. The apparatus of claim 14, further comprising, for each region, defining the region to include an area specified for the region size of the region.
18. The apparatus of claim 14, further comprising, for each region, defining the region to include an area within a percentage of a target area specified for the region size of the region.
19. The apparatus of claim 14, further comprising, for each region, defining the region to include a number of homes specified for the region size of the region or a percentage of the number of homes specified for the region size of the region.
20. The apparatus of claim 14, wherein the aggregation of the values of the home attribute comprises computing at least one of: a median, a mean, or a variance of the values of the home attribute across the homes in the region.
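
The region-identifier and aggregation steps recited in the claims can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes the widely used public base-32 geohash scheme, and the function names (`geohash_encode`, `region_ids`, `region_median`), the `homes` record layout, and the precision values are hypothetical choices made for illustration; the claims do not specify them. The sketch shows how a coarser region identifier can be obtained by discarding the last character of a finer one, and how a home attribute can be aggregated over the homes in a region for which a value is available.

```python
from statistics import median

# Alphabet of the public base-32 geohash scheme (an assumption; the claims
# only refer to "a geohash encoding value").
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=7):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def region_ids(lat, lon, finest=7, levels=4):
    """Return region ids from the finest to the coarsest region size.

    Each coarser id is the next-finer id with its last character discarded.
    """
    if levels > finest:
        raise ValueError("levels must not exceed the finest precision")
    code = geohash_encode(lat, lon, finest)
    return [code[: finest - i] for i in range(levels)]


def region_median(homes, region_id, attribute):
    """Median of `attribute` over homes in the region that have a value for it.

    `homes` is a list of dicts, each with a "geohash" key at the finest
    precision (a hypothetical record layout). Returns None if no home in the
    region has a value.
    """
    values = [
        h[attribute]
        for h in homes
        if h["geohash"].startswith(region_id) and h.get(attribute) is not None
    ]
    return median(values) if values else None
```

Because every coarser identifier is a prefix of the finer one, membership of a home in a region at any size reduces to a string-prefix test, which is what `region_median` relies on. A mean or variance could be substituted for the median in the same way.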

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a division of U.S. patent application Ser. No. 16/457,390, filed on Jun. 28, 2019, entitled “VALUATION OF HOMES USING GEOGRAPHIC REGIONS OF VARYING GRANULARITY,” the disclosure of which is incorporated herein by reference in its entirety.

This application is related to the following applications, each of which is hereby incorporated by reference in its entirety: U.S. patent application Ser. No. 11/347,000 filed on Feb. 3, 2006 (now U.S. Pat. No. 8,676,680); U.S. patent application Ser. No. 11/347,024 filed on Feb. 3, 2006 (now U.S. Pat. No. 7,970,674); U.S. patent application Ser. No. 11/524,048 filed on Sep. 19, 2006 (now U.S. Pat. No. 8,515,839); U.S. patent application Ser. No. 11/971,758 filed on Jan. 9, 2008 (now U.S. Pat. No. 8,140,421); U.S. patent application Ser. No. 13/044,480 filed on Mar. 9, 2011; U.S. Provisional Patent Application No. 61/706,241 filed on Sep. 27, 2012; U.S. patent application Ser. No. 15/715,098 filed on Sep. 25, 2017; U.S. Provisional Patent Application No. 61/761,153 filed on Feb. 5, 2013; U.S. patent application Ser. No. 14/640,860 filed on Mar. 6, 2015; U.S. Provisional Patent Application No. 61/939,268 filed on Feb. 13, 2014; U.S. patent application Ser. No. 15/439,388 filed on Feb. 22, 2017; U.S. patent application Ser. No. 11/525,114 filed on Sep. 20, 2006; U.S. patent application Ser. No. 12/924,037 filed on Sep. 16, 2010; U.S. patent application Ser. No. 13/245,584 filed on Sep. 26, 2011 (now U.S. Pat. No. 10,078,679); and U.S. patent application Ser. No. 16/178,457 filed on Nov. 1, 2018; U.S. Provisional Patent Application No. 62/821,159 filed on Mar. 20, 2019; and U.S. patent application Ser. No. 16/423,873 filed on May 28, 2019.

BACKGROUND

In many roles, it can be useful to be able to accurately determine the value of residential real estate properties (“homes”). As examples, by using accurate values for homes: taxing bodies can equitably set property tax levels; sellers and their agents can optimally set listing prices; buyers and their agents can determine appropriate offer amounts; insurance firms can properly value their insured assets; and mortgage companies can properly determine the value of the assets securing their loans.

A variety of conventional approaches exist for valuing houses. For a house that was very recently sold, one approach is attributing its selling price as its value.

Another widely-used conventional approach to valuing houses is appraisal, where a professional appraiser determines a value for a house by comparing some of its attributes to the attributes of similar nearby homes that have recently sold (“comps”). The appraiser arrives at an appraised value by subjectively adjusting the sale prices of the comps to reflect differences between the attributes of the comps and the attributes of the house being appraised, then aggregating these adjusted sale prices, such as by determining their mean.

A further widely-used conventional approach to valuing houses involves statistical modeling. For a particular geographic region, such as a county, home sale transactions are used together with attributes of the sold homes to train a model capable of predicting the value of an arbitrarily-selected home within the geographic region based upon its attributes. This model can then be applied to the attributes of any home in the geographic area in order to estimate the value of this home.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing some of the components typically incorporated in at least some of the computer systems and other devices on which the facility operates.
FIG. 2 is a flow diagram showing a process performed by the facility in some embodiments to establish a trained home valuation model.
FIG. 3 is a flow diagram showing a process performed by the facility in some embodiments to establish independent variables for a home.
FIG. 4 is a map diagram showing an example of identifying a region containing the home's geographic location.
FIG. 5 is a map diagram showing a region of a larger region size identified by the facility.
FIG. 6 is a flow diagram showing a process performed by the facility in some embodiments to create one or more independent variables based on an identified region.
FIG. 7 is a table diagram showing sample contents of a region id independent variable table in which the facility in some embodiments stores independent variables it creates for a particular home containing region identifiers.
FIG. 8 is a table diagram showing sample contents of a region aggregate independent variable table used by the facility in some embodiments to store region aggregate independent variables created by the facility for a particular home.
FIG. 9 is a flow diagram showing a process performed by the facility in some embodiments to estimate the value of a home using the model trained by the facility.

DETAILED DESCRIPTION

The inventors have recognized that the conventional