Dealing with Class Imbalance in Uplift Modeling-Efficient Data Preprocessing via Oversampling and Matching

dc.coverage DOI: 10.1109/ACCESS.2024.3511339
dc.creator Vairetti, Carla
dc.creator José Marfán, María
dc.creator Maldonado, Sebastián
dc.date 2024
dc.date.accessioned 2026-01-05T21:15:32Z
dc.date.available 2026-01-05T21:15:32Z
dc.description Uplift modeling is a widely recognized predictive approach used to identify individuals who are more likely to respond positively to an intervention or treatment, such as a marketing campaign. However, this approach can be negatively affected by the class-imbalance problem, which occurs when the distribution of target classes is highly skewed. For instance, in a marketing campaign, typically only a small fraction of the targeted individuals responds with a purchase. In this paper, we propose a novel resampling scheme that addresses the class-imbalance issue by combining intelligent oversampling and propensity score matching (PSM). By leveraging intelligent oversampling in observational studies, we alleviate the class-imbalance problem and mitigate the information loss caused by PSM. We introduce two efficient resampling schemes that intelligently combine these approaches. To ensure scalability and effectiveness, we adopt a distributed framework based on MapReduce and use a hybrid spill-tree algorithm for efficient nearest-neighbor search. Our experimental results demonstrate the advantages of the proposed method, which achieves statistically superior predictive performance compared to other resampling approaches while remaining efficient in terms of overall running time. (eng)
dc.identifier https://investigadores.uandes.cl/en/publications/9c8ecc93-6e3c-43db-9a6e-fc429563417d
dc.identifier.uri https://repositorio.uandes.cl/handle/uandes/66629
dc.language eng
dc.rights info:eu-repo/semantics/openAccess
dc.source vol.12 (2024) p.188993-189008
dc.subject Computational and artificial intelligence
dc.subject big data
dc.subject class imbalance
dc.subject uplift modeling
dc.title Dealing with Class Imbalance in Uplift Modeling-Efficient Data Preprocessing via Oversampling and Matching (eng)
dc.type Article (eng)
dc.type Artículo (spa)
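
The two preprocessing steps named in the description, oversampling and propensity score matching, can be illustrated with a minimal sketch. This is not the authors' method: plain random duplication stands in for the "intelligent" oversampling, a brute-force search stands in for the hybrid spill-tree nearest-neighbor search, nothing is distributed, and the propensity scores are assumed to be already estimated. The function names, the `ps` and `y` keys, the caliper value, and the sample data are all hypothetical.

```python
import random


def oversample_minority(rows, label_key="y", seed=0):
    """Randomly duplicate minority-class rows until both classes are equal in size.

    Assumption: simple random oversampling, used here only as a stand-in.
    """
    rng = random.Random(seed)
    pos = [r for r in rows if r[label_key] == 1]
    neg = [r for r in rows if r[label_key] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        return list(rows)  # nothing to duplicate
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(rows) + extra


def match_on_propensity(treated, control, caliper=0.05, score_key="ps"):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    A treated row is paired with the closest remaining control row, and is
    dropped if no control lies within the caliper (hypothetical threshold).
    """
    available = list(control)
    pairs = []
    for t in sorted(treated, key=lambda r: r[score_key]):
        if not available:
            break
        # Brute-force nearest neighbor on the one-dimensional score.
        best = min(available, key=lambda c: abs(c[score_key] - t[score_key]))
        if abs(best[score_key] - t[score_key]) <= caliper:
            pairs.append((t, best))
            available.remove(best)
    return pairs


# Hypothetical toy data: "ps" is a precomputed propensity score, "y" the response.
treated = [{"ps": 0.30, "y": 1}, {"ps": 0.62, "y": 0}, {"ps": 0.90, "y": 0}]
control = [{"ps": 0.28, "y": 0}, {"ps": 0.60, "y": 0}, {"ps": 0.10, "y": 1}]

# The treated row with ps = 0.90 has no control within the caliper and is dropped.
pairs = match_on_propensity(treated, control)

# Two responders out of six rows become four out of eight after duplication.
balanced = oversample_minority(treated + control)
```

In the toy data, matching discards one treated row, which is the kind of information loss the description attributes to PSM, while oversampling adds rows instead of removing them.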