Latest news with #syntheticdata


Forbes
20-06-2025
- Business
- Forbes
Great AI Needs Great (Synthetic) Data
Jennifer Chase is Chief Marketing Officer and Executive Vice President at SAS.

Every year, I am asked what marketing innovation I am most excited about, and for 2025, my answer may be surprising. I know you're probably expecting me to say AI agents or AI-created interactive marketing assets, but bear with me as I explain why I think synthetic data generation should be the most hotly anticipated tech for marketers this year.

As marketers, we are not data poor. However, we are data starved. By that, I mean marketers are starved of cost-effective, high-quality data that we can use to create hyper-personalized marketing. For AI models to run effectively, the model input data must be complete and of good quality. Too often, our datasets have gaping holes.

Synthetic data generation is a component of generative AI (GenAI). With this tech, marketers can generate artificial datasets that share the attributes and characteristics of real customer data, but without the liabilities and limitations that come with it. According to Gartner, 'By 2026, 75% of businesses will use generative AI to create synthetic customer data, up from less than 5% in 2023.'

Why is this important? For marketers, I believe there are three main reasons.

First, we need good quality data for the development of AI applications. However, this can be a challenge when privacy considerations and regulations are of utmost importance. Synthetic data can help with data privacy by creating data with the same patterns as real data, but with none of the identifying information. This level of data anonymity can help us safeguard personal data. As communications and marketing leaders, we are the trusted stewards of customer data, and I am excited about the role synthetic data can play in helping us protect it.

Second, eradicating bias in our datasets should be a paramount consideration for all marketers. Not only is bias unethical, but it also leads to inaccurate analyses that can negatively affect campaign and customer journey effectiveness. The wonder of synthetic data generation is that we can create more representative datasets. For instance, certain groups may be underrepresented in the real data, leading to biased model predictions. Using synthetic data generation, we can create supplementary data for those underrepresented groups, ensuring a fairer distribution. Additionally, synthetic data can be designed to exclude biases that are often present in datasets.

Third, organizations spend a lot of time acquiring and preparing data, and it's not a one-time process. Data decays. The generation of synthetic data can help limit some of the costs that come with that decay. A great way to improve efficiency using synthetic data in marketing is look-alike modeling: generated data with the same features, structures and attributes as real-life datasets can help brands identify new audiences quickly and at scale. Something marketers probably don't spend much time thinking about is the cost of data labeling. This is a hidden cost of data analysis, because annotating large datasets is time-consuming and expensive. When using data-generation technology, make sure it's designed to include data labeling automatically.

Synthetic data has tremendous upside, from privacy protection to mitigating bias and reducing costs, all while improving overall marketing effectiveness. However, with this potential comes responsibility. Marketers must establish clear governance within their organization around when to use synthetic data. Beyond this, make sure you have defined guidelines for labeling and identifying the use of synthetic data to avoid misuse and misunderstanding.

As a CMO, I'm always looking for ways to reduce costs without reducing effectiveness, and synthetic data fits this bill for me. With the myriad ways it can aid marketing, especially in rapid experimentation, I believe synthetic data is going to cement its place in the continued evolution of marketing.

Forbes Communications Council is an invitation-only community for executives in successful public relations, media strategy, creative and advertising agencies.
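
To make the article's point about underrepresented groups concrete, here is a minimal Python sketch of one way supplementary records could be generated. It is not the author's or SAS's method. It assumes a simplified, SMOTE-like interpolation between pairs of real records, and the `customers` table, the field names and the `oversample_group` helper are all hypothetical. Each generated row carries a `synthetic` flag, in line with the article's advice to label synthetic data clearly.

```python
import random
from collections import Counter


def oversample_group(rows, group_key, target_group, n_new, numeric_keys, seed=0):
    """Create n_new synthetic rows for target_group.

    Hypothetical helper: each new row is a random point on the line
    between two real rows of the same group (a simplified, SMOTE-like idea).
    """
    rng = random.Random(seed)
    pool = [r for r in rows if r[group_key] == target_group]
    if len(pool) < 2:
        raise ValueError("need at least two real rows to interpolate between")

    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(pool, 2)   # two distinct real rows
        t = rng.random()             # interpolation weight in [0, 1)
        new_row = {group_key: target_group, "synthetic": True}
        for key in numeric_keys:
            new_row[key] = a[key] + t * (b[key] - a[key])
        synthetic.append(new_row)
    return synthetic


# Made-up example data: segment "B" is underrepresented.
customers = [
    {"segment": "A", "age": 34, "monthly_spend": 120.0},
    {"segment": "A", "age": 29, "monthly_spend": 95.5},
    {"segment": "A", "age": 41, "monthly_spend": 150.0},
    {"segment": "A", "age": 38, "monthly_spend": 110.0},
    {"segment": "A", "age": 45, "monthly_spend": 135.0},
    {"segment": "A", "age": 31, "monthly_spend": 99.0},
    {"segment": "B", "age": 52, "monthly_spend": 80.0},
    {"segment": "B", "age": 61, "monthly_spend": 95.0},
]

counts = Counter(r["segment"] for r in customers)
deficit = counts["A"] - counts["B"]  # rows needed to balance the two segments

augmented = customers + oversample_group(
    customers, "segment", "B", deficit, ["age", "monthly_spend"]
)
print(Counter(r["segment"] for r in augmented))
```

Interpolating between real records keeps each generated value within the range already observed for that group, but it does not by itself provide privacy guarantees or remove bias that is present in the source records. A production system would more likely rely on a dedicated generative model with its own validation and governance.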


Tahawul Tech
19-06-2025
- Health
- Tahawul Tech
drug discovery Archives
The synthetic data, which SandboxAQ is releasing publicly, can be used to train AI models that can predict whether a new drug molecule is likely to stick to the protein researchers are targeting.

