Regulation, cost, and other factors can hinder the many benefits of access to data for analysis. This impedes not only the efforts of organizations to rationalize resources (e.g., workforce analytics, resource allocations, and consumer insights), but also research into health, medicine, the social sciences, and other endeavors that benefit society at large.

Finding a balance between privacy and the potential benefits of sharing personally identifiable or sensitive data (e.g., personal health information) seems an intractable problem, putting privacy advocates and those who wish to use data for commercial or research purposes at loggerheads. There is, however, a rapidly emerging solution to the use and sharing of data across organizations and, indeed, across borders: data synthesis.

Synthetic data, as its name implies, is not actual data taken from real-world events or individuals’ attributes. Rather, it is data generated by a computer (i.e., by synthetic data generation tools) that matches the key statistical properties of the real sample data. Importantly, the resulting data is not the actual data that has been pseudonymized or anonymized. It is artificial data that does not map back to any actual natural person.

The ability to create and share data that provides insights into cohorts and segments without impinging on privacy has profound implications for the adtech and martech ecosystem, which is struggling with privacy-centric moves by Apple, Google, and other platforms.

During the Spokes Privacy Conference, Dr. Khaled El Emam, CEO of Replica Analytics, provided a highly accessible technical overview of data synthesis and synthetic data. He was joined by Mike Hintze, Partner at Hintze Law PLLC and a Future of Privacy Forum Senior Fellow.

How Synthetic Data is Generated

Beginning with the source dataset (the actual data, say of individual health characteristics or consumer actions), you create a model of that dataset, typically using AI and machine learning techniques. The resultant model “captures the patterns and statistical properties of that source data set,” explains Khaled. You then apply that model to generate the synthetic data. Because the synthetic data is produced from a model, there is no one-to-one mapping between the synthetic records created (e.g., individual consumers) and the “real” source records.
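The two-step process described in this article (fit a model to the source data, then sample new records from that model) can be illustrated with a minimal sketch. It makes several assumptions that are not from the talk: the "source" records are made-up (age, monthly spend) pairs, the model is a simple bivariate Gaussian rather than the machine learning models that real synthesis tools typically use, and the function names `make_source`, `fit_model`, and `generate` are hypothetical.

```python
import math
import random
import statistics


def make_source(n, rng):
    """Build a made-up 'source' dataset of (age, monthly_spend) records.

    Illustrative stand-in only; it does not represent any real data.
    """
    records = []
    for _ in range(n):
        age = rng.gauss(45, 12)
        spend = 20 + 1.5 * age + rng.gauss(0, 10)
        records.append((age, spend))
    return records


def fit_model(records):
    """Step 1: summarize the source data as a toy bivariate Gaussian model."""
    xs = [r[0] for r in records]
    ys = [r[1] for r in records]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    # Sample covariance, converted to a Pearson correlation below.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in records) / (len(records) - 1)
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "sd_x": sd_x,
        "sd_y": sd_y,
        "rho": cov / (sd_x * sd_y),
    }


def generate(model, n, rng):
    """Step 2: draw n brand-new records from the fitted model."""
    rho = model["rho"]
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    synthetic = []
    for _ in range(n):
        # Two independent standard normals, mixed to reproduce the correlation.
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = model["mean_x"] + model["sd_x"] * z1
        y = model["mean_y"] + model["sd_y"] * (rho * z1 + residual * z2)
        synthetic.append((x, y))
    return synthetic


rng = random.Random(0)
source = make_source(500, rng)          # the "real" data (made up here)
model = fit_model(source)               # the model of the source data
synthetic = generate(model, 5000, rng)  # synthetic records drawn from the model
refit = fit_model(synthetic)            # statistics of the synthetic data

print("source model:   ", model)
print("synthetic refit:", refit)
```

In this sketch the generator sees only the fitted model, never the source records, and it produces 5,000 synthetic records from 500 source records. That illustrates why there is no one-to-one mapping between synthetic and source records, while the refit statistics should stay close to those of the source.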