Skip to content

Identified the most important factors contributing to user adoption for a product. Achieved a 96% accuracy with an SVM model, and found user length and opting into the mailing list as the most significant predictors.

Notifications You must be signed in to change notification settings

merrillm1/OMSCS_Computational_Analytics_EC_Project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Results

The aim of this project is to predict the factors that contribute most to user adoption. User adoption here is defined as a user who has logged in at least three times within any 7 day period. Factors here include the following:

- name:  the  user's  name
- object_id:   the  user's  id
- email:  email  address
- creation_source:   how  their  account  was  created.  This  takes  on  one
  of  5  values:
    - PERSONAL_PROJECTS:  invited  to  join  another  user's
      personal  workspace
    - GUEST_INVITE:  invited  to  an  organization  as  a  guest
      (limited  permissions)
    - ORG_INVITE:  invited  to  an  organization  (as  a  full  member)
    - SIGNUP:  signed  up  via  the  website
    - SIGNUP_GOOGLE_AUTH:  signed  up  using  Google
- Authentication  (using  a  Google  email  account  for  their  login id)
- creation_time:  when  they  created  their  account
- last_session_creation_time:   unix  timestamp  of  last  login
- opted_in_to_mailing_list:  whether  they  have  opted  into  receiving
  marketing  emails
- enabled_for_marketing_drip:  whether  they  are  on  the  regular
  marketing  email  drip
- org_id:   the  organization  (group  of  users)  they  belong  to
- invited_by_user_id:   which  user  invited  them  to  join  (if  applicable).

A second dataframe was also provided which details each users activity.

Feature selection was made using an SVM model, as our dataframe was relatively small (<10^5) and sparse. The SVM model was also chosen for its ability to find the maximal margin between a nonlinear approximator.

The results here suggest that the user length and and being opted in to the mailing list are among the most significant indicators of a user becoming adopted. Exploratory data analysis revealed a slight correlation between how a user was created and whether they became adopted, this was also reflected in the model results. Those invited by an organization or a guest were also strong indicators of adoption.

I do have a few reservations about these results however. It is natural to assume that a longer time frame for a user would open greater opportunities for them to become adopted. For those who had signed up towards the end of the documented period, the likelihood that they became adopted reduced, as evidenced by a time series plot in the notebook. More work can be done here to limit the engagements to those who have been enrolled for at least a designated time period.

The model results seemed promising. The accuracy for the SVM model was 96% and the AUROC was close to perfect. The data here was highly imbalanced however, so the more appropriate evaluation metric here is the F1 score, our model gave an F1 score of approximately an 87%.

About

Identified the most important factors contributing to user adoption for a product. Achieved a 96% accuracy with an SVM model, and found user length and opting into the mailing list as the most significant predictors.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published