Commit
Fix: Add README.md and processing scripts
Data processing scripts for the original Fannie Mae dataset and a project README.md file are added to the repo.
* Dataset processing files are now under the /processing folder.
* A new README.md file is added in the root folder.
1 parent 0bcac6c, commit 4ca1de9
Showing 10 changed files with 895 additions and 36 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
# Smart Underwriter | ||
|
||
## Background | ||
|
||
Mortgage underwriting is becoming increasingly automated. Fannie Mae and | ||
Freddie Mac, two major US housing GSEs (government-sponsored enterprises), each | ||
have their own AUS (automated underwriting system): [Desktop | ||
Underwriter](https://www.fanniemae.com/singlefamily/desktop-underwriter "Desktop | ||
Underwriter") and [Loan Prospector](http://www.loanprospector.com/ "Loan | ||
Prospector"). An AUS is valuable because it can provide a fast, objective | ||
decision based on the mortgage data. | ||
|
||
Fannie Mae is the largest mortgage backer in the US housing market. It has | ||
released "a subset of Fannie Mae’s 30-year, fully amortizing, full | ||
documentation, single-family, conventional fixed-rate mortgages" on its website | ||
to "promote better understanding of the credit performance of Fannie Mae | ||
mortgage loans". This data is also well suited for building our own mortgage | ||
risk assessment model. | ||
|
||
In this project, the Fannie Mae data was downloaded, compiled, aggregated, and | ||
then fed into a machine learning model to build a credit risk prediction | ||
model. You can find the demo site at [Here](). | ||
|
||
## Workflow | ||
|
||
To recreate the data processing, modeling, and web development, follow these | ||
steps: | ||
|
||
1. Download the dataset from | ||
[Fannie Mae's website](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html | ||
"Download the data"). You can use the `download.sh` script in the `processing` | ||
folder, but you will need to supply a separate cookie file for the download to | ||
work. If you are using Firefox, you can install the | ||
[Export Cookie](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) | ||
extension to create one. | ||
|
||
2. Aggregate the loan performance data. At this time, I am only focusing on the | ||
terminal status of each loan: `data_process.py` keeps the last status of | ||
each loan and discards any intermediate statuses. In the future, a time-series | ||
based model will be developed to predict the time-dependent loan status. | ||
**CAUTION:** The data was processed on a DigitalOcean (DO) droplet with 16 GB of | ||
memory. The current script uses Python pandas to process the data. My plan is to | ||
rewrite the whole thing in Spark. | ||
|
||
3. Further aggregate the quarterly data into yearly and then multi-year | ||
datasets using `merged-quarter.py` and `merged-year.py`. | ||
|
||
4. Use the `learning.py` script to do the machine learning. Currently, a logistic | ||
regression model and a stochastic gradient descent (SGD) based support vector | ||
machine are used. The SGD model gives the better AUC-ROC value, so it is picked. | ||
|
||
5. Run the Flask web server with `python3 run.py`. | ||
|
||
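Step 2 of the workflow reduces each loan's performance history to its terminal status. The sketch below illustrates that idea only. It is not the code in `data_process.py`, which the README says uses pandas. The column names `loan_id`, `period`, and `status`, the sample rows, and the function name are hypothetical, and the sketch assumes periods are written as `YYYY-MM` so that plain string comparison orders them correctly. It uses only the standard library and reads one row at a time.

```python
import csv
import io


def last_status_per_loan(rows):
    """Return {loan_id: (period, status)}, keeping only the latest period per loan."""
    latest = {}
    for row in rows:
        loan_id = row["loan_id"]
        period = row["period"]  # assumed "YYYY-MM", so string order is time order
        # Keep this row if the loan is new or the row is at least as recent.
        if loan_id not in latest or period >= latest[loan_id][0]:
            latest[loan_id] = (period, row["status"])
    return latest


# Made-up sample data standing in for one performance file.
sample = io.StringIO(
    "loan_id,period,status\n"
    "A1,2005-01,current\n"
    "A1,2005-03,prepaid\n"
    "A1,2005-02,current\n"
    "B2,2006-07,current\n"
    "B2,2006-08,default\n"
)

terminal = last_status_per_loan(csv.DictReader(sample))
for loan_id, (period, status) in sorted(terminal.items()):
    print(loan_id, period, status)
```

Because only one entry per loan is held in memory, a single pass like this scales with the number of loans rather than the number of monthly records, which may matter given the memory caution in step 2.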
## License | ||
|
||
Copyright (C) Zhenqing Li. Licensed under GPL v3. | ||
|
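Step 4 of the README above picks a model by its AUC-ROC value. As a reminder of what that metric measures, here is a minimal sketch that computes AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The labels, scores, and model names are made up for illustration, and the README does not show how `learning.py` actually computes the metric.

```python
def roc_auc(labels, scores):
    """AUC-ROC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical default labels (1 = default) and scores from two candidate models.
labels = [0, 0, 1, 1]
model_a_scores = [0.1, 0.4, 0.35, 0.8]
model_b_scores = [0.2, 0.3, 0.6, 0.9]

auc_a = roc_auc(labels, model_a_scores)
auc_b = roc_auc(labels, model_b_scores)
best = "model_b" if auc_b > auc_a else "model_a"
print(auc_a, auc_b, best)
```

The pairwise loop is quadratic, so it is only suitable for small illustrations. On a dataset of this size a library implementation would be the practical choice.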
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,38 +1,57 @@ | ||
{% extends "base.html" %} | ||
{% block content %} | ||
<h1>Prediction of Default of Home Mortgage</h1> | ||
<form action="/index" method="post" name="input_morgage_data"> | ||
<!-- {{ form.csrf_token }} --> | ||
<p> | ||
{{form.loan_amount.label}}: {{form.loan_amount(size=10)}} <br> | ||
{% for error in form.loan_amount.errors %} | ||
<span style="color: red;">[{{ error }}]</span> | ||
{% endfor %}<br> | ||
{{form.buyer_credit.label}}: {{form.buyer_credit()}} <br> | ||
{% for error in form.buyer_credit.errors %} | ||
<span style="color: red;">[{{ error }}]</span> | ||
{% endfor %}<br> | ||
{{form.cobuyer_credit.label}} (if available): {{form.cobuyer_credit()}} <br> | ||
{% for error in form.cobuyer_credit.errors %} | ||
<span style="color: red;">[{{ error }}]</span> | ||
{% endfor %}<br> | ||
{{form.loan_to_value.label}}: {{form.loan_to_value()}} <br> | ||
{% for error in form.loan_to_value.errors %} | ||
<span style="color: red;">[{{ error }}]</span> | ||
{% endfor %}<br> | ||
{{form.debt_to_income.label}}: {{form.debt_to_income()}} <br> | ||
{% for error in form.debt_to_income.errors %} | ||
<span style="color: red;">[{{ error }}]</span> | ||
{% endfor %}<br> | ||
</p> | ||
<p> | ||
{{form.loan_state.label}}: {{form.loan_state()}} <br> | ||
{{form.loan_purpose.label}}: {{form.loan_purpose()}} <br> | ||
{{form.property_type.label}}: {{form.property_type()}} <br> | ||
{{form.occupancy_type.label}}: {{form.occupancy_type()}} <br> | ||
</p> | ||
<p><input type="submit" value="Get Loan Prediction"></p> | ||
</form> | ||
<p> This mortgage application is {{result}} </p> | ||
|
||
<div class="container"> | ||
<div class="jumbotron"> | ||
<h2>Prediction of Default of Home Mortgage</h2> | ||
<form action="/index" method="post" name="input_morgage_data"> | ||
<!-- {{ form.csrf_token }} --> | ||
<p> | ||
{{form.loan_amount.label}}: {{form.loan_amount(size=10)}} <br> | ||
{% for error in form.loan_amount.errors %} | ||
<span style="color: red;">[{{ error }}]</span> | ||
{% endfor %}<br> | ||
{{form.buyer_credit.label}}: {{form.buyer_credit()}} <br> | ||
{% for error in form.buyer_credit.errors %} | ||
<span style="color: red;">[{{ error }}]</span> | ||
{% endfor %}<br> | ||
{{form.cobuyer_credit.label}} (if available): {{form.cobuyer_credit()}} <br> | ||
{% for error in form.cobuyer_credit.errors %} | ||
<span style="color: red;">[{{ error }}]</span> | ||
{% endfor %}<br> | ||
{{form.loan_to_value.label}}: {{form.loan_to_value()}} <br> | ||
{% for error in form.loan_to_value.errors %} | ||
<span style="color: red;">[{{ error }}]</span> | ||
{% endfor %}<br> | ||
{{form.debt_to_income.label}}: {{form.debt_to_income()}} <br> | ||
{% for error in form.debt_to_income.errors %} | ||
<span style="color: red;">[{{ error }}]</span> | ||
{% endfor %}<br> | ||
</p> | ||
<p> | ||
{{form.loan_state.label}}: {{form.loan_state()}} <br> | ||
{{form.loan_purpose.label}}: {{form.loan_purpose()}} <br> | ||
{{form.property_type.label}}: {{form.property_type()}} <br> | ||
{{form.occupancy_type.label}}: {{form.occupancy_type()}} <br> | ||
</p> | ||
<p><input type="submit" value="Get Loan Prediction"></p> | ||
</form> | ||
<p> This mortgage application is {{result}} </p> | ||
</div> | ||
</div> | ||
|
||
<div class="container"> | ||
<div class="jumbotron"> | ||
<h4 id="about">About</h4> | ||
<p>This model is built on Fannie Mae's data from 2000 to 2012, | ||
downloaded | ||
from <a href="http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html">Fannie | ||
Mae Single Family Housing Data</a>. The data was processed in | ||
Python with the help of numpy/scipy, pandas and scikit-learn. You | ||
can find the source code | ||
in <a href="https://github.com/DigitalPig/SmartUnderwriter">this</a> | ||
GitHub repo.<br></p> | ||
<p>© 2016 Zhenqing Li</p> | ||
</div> | ||
</div> | ||
{% endblock %} |