Data Analysis

GitHub - salesforce/Merlion: Merlion: A Machine Learning Framework for Time Series Intelligence

Merlion is a comprehensive Python library for time series intelligence, offering end-to-end machine learning capabilities for forecasting, anomaly detection, and change point detection. The library features standardized data loading, diverse models, AutoML capabilities, and practical post-processing rules, while supporting both univariate and multivariate analysis with distributed computation via PySpark.

GitHub - iamtelescope/telescope: Web-based log viewer UI. Explore logs data stored in ClickHouse

Telescope is a web application for exploring log data stored in ClickHouse databases, offering intuitive filtering, searching, and analysis capabilities. The platform provides management of multiple connections, customizable visualizations, and GitHub-based authentication with permission controls. Telescope is still in development, with planned features including custom SQL queries, live log tailing, and additional authentication methods.

Google Co-Scientist AI cracks superbug problem in two days! — because it had been fed the team’s previous paper with the answer in it

Google's Co-Scientist AI tool, powered by the Gemini LLM, made headlines for supposedly solving a superbug problem in 48 hours, but it was later revealed that the tool had been fed the team's previous paper, which already contained the answer. Similar patterns of overstated achievements were found in Google's other AI research claims, including drug discovery and materials synthesis.

James Pae

An analytical study investigates the correlation between kebab restaurant quality and proximity to train stations in Paris using the Google Places API and geospatial analysis. Despite thorough data collection covering 400 establishments and complex spatial analysis, the results showed only a weak correlation (a Pearson coefficient of 0.091), leaving the hypothesis largely unconfirmed.
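
For readers unfamiliar with the statistic, the following is a minimal sketch of how a Pearson coefficient is computed from paired observations. It is not the study's code: the function name `pearson` and the sample distances and ratings are hypothetical, made up purely for illustration, and the sketch does not reproduce the reported 0.091.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample: distance to the nearest station in metres, and a rating.
# These numbers are invented for illustration and are not from the study.
distances_m = [50, 120, 300, 450, 800, 1100]
ratings = [4.1, 3.8, 4.4, 3.9, 4.3, 4.0]

r = pearson(distances_m, ratings)
print(f"Pearson r = {r:.3f}")
```

A coefficient near 0, such as the one the study reports, means that a straight-line relationship explains almost none of the variation in ratings; values near +1 or -1 would indicate a strong linear relationship.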

League of Legends data scraping the hard and tedious way for fun

A developer reverse-engineered League of Legends' replay system to extract high-fidelity gameplay data by decrypting game packets and emulating game engine functions, achieving better performance than existing approaches. The work demonstrates methods for accessing detailed match data including precise player positions, ability usage, and damage calculations that are not available through official APIs.

Ambsheets: Spreadsheets for exploring scenarios

A new spreadsheet concept called 'Ambsheets' introduces 'amb values' that allow a cell to hold multiple values simultaneously, making it easier to explore and compare scenarios. The concept addresses limitations of traditional spreadsheets and of Excel's What-If Analysis by automatically computing all possible combinations while maintaining a seamless user interface.
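
The idea of computing every combination can be sketched in a few lines. This is an illustration of the general technique only, not Ambsheets' actual implementation: the function `evaluate_scenarios`, the input names `price` and `units`, and the revenue formula are all hypothetical.

```python
from itertools import product


def evaluate_scenarios(amb_inputs, formula):
    """Evaluate `formula` once for every combination of the candidate values.

    amb_inputs maps an input name to the list of values that cell may take.
    Returns a list of (scenario, result) pairs, where scenario is a dict
    giving the value chosen for each input.
    """
    names = list(amb_inputs)
    results = []
    # product() yields the Cartesian product of all candidate value lists.
    for combo in product(*(amb_inputs[name] for name in names)):
        scenario = dict(zip(names, combo))
        results.append((scenario, formula(**scenario)))
    return results


# Hypothetical example: two "amb" cells feeding one revenue formula.
scenarios = evaluate_scenarios(
    {"price": [10, 12], "units": [100, 150, 200]},
    lambda price, units: price * units,
)

for scenario, revenue in scenarios:
    print(scenario, "->", revenue)
```

With two candidate prices and three candidate unit counts, the sketch evaluates the formula six times, once per combination, which is the kind of exhaustive comparison the summary describes.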