Gender Inference using Statistical Name Characteristics in Twitter
Juergen Mueller and Gerd Stumme
Much attention has been given to the task of gender inference of Twitter users. Although names are strong gender indicators, the names of Twitter users are rarely used as a feature, probably because many of them are ill-formed and cannot be found in any name dictionary. Instead of relying solely on a name database, we propose a novel name classifier. Our approach extracts characteristics from user names and uses them to assign each name to a gender. This enables us to classify international first names as well as ill-formed names.
5th ASE International Conference on Social Informatics, SocInfo 2016, Union, NJ, USA, August 15-17, 2016. Proceedings, pp. 47:1-47:8, ACM, New York, NY, USA, 2016
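
The abstract does not say which name characteristics are extracted or which classifier is used, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes a hypothetical feature set (last letter, capped length, character bigrams) and a small naive Bayes model with add-one smoothing. The function and class names, the labels, and the toy training names are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


def name_features(name, n=2):
    """Hypothetical features: last letter, capped length, character n-grams."""
    # Keep letters only, so decorations such as digits or underscores are ignored.
    cleaned = "".join(ch for ch in name.lower() if ch.isalpha())
    if not cleaned:
        return []
    padded = "^" + cleaned + "$"  # mark the start and end of the name
    feats = ["last=" + cleaned[-1], "len=" + str(min(len(cleaned), 10))]
    feats += ["ng=" + padded[i:i + n] for i in range(len(padded) - n + 1)]
    return feats


class NameGenderClassifier:
    """Multinomial naive Bayes over name features (an assumed model choice)."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocabulary = set()

    def fit(self, names, labels):
        for name, label in zip(names, labels):
            self.label_counts[label] += 1
            for feat in name_features(name):
                self.feature_counts[label][feat] += 1
                self.vocabulary.add(feat)
        return self

    def predict(self, name):
        feats = name_features(name)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            # Add-one smoothing: unseen features get a small nonzero probability.
            denom = sum(self.feature_counts[label].values()) + len(self.vocabulary)
            score = math.log(self.label_counts[label] / total)
            for feat in feats:
                score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this sketch; a real system needs far more names.
train_names = ["Anna", "Maria", "Julia", "Laura", "Peter", "John", "Mark", "David"]
train_labels = ["female"] * 4 + ["male"] * 4
clf = NameGenderClassifier().fit(train_names, train_labels)

print(clf.predict("~~Maria99~~"))
print(clf.predict("xX_peter_Xx"))
```

Because the features are computed from the characters of the name rather than looked up in a dictionary, a decorated or previously unseen name still yields features that can be scored, which is the property the abstract highlights.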