Compare commits
22 Commits
| Author | SHA1 | Date |
|---|---|---|
| | adc002b38d | |
| | 5994d8ae7b | |
| | f68ada7204 | |
| | 2217bd5855 | |
| | fa5d59b069 | |
| | 11e6348a8e | |
| | c078017b5f | |
| | f3da58e202 | |
| | 57e969a647 | |
| | 0340dea4f8 | |
| | 2b6aebdab4 | |
| | a6292d2d0f | |
| | a44c2bfc04 | |
| | 33621bdec4 | |
| | d1f562ce94 | |
| | 5e00df4e13 | |
| | 293d3e26a6 | |
| | ea55c7ad6d | |
| | 12e7cffca1 | |
| | 595237c172 | |
| | e57869374b | |
| | 487d59512a | |
5
.dockerignore
Normal file
@@ -0,0 +1,5 @@
fly.toml
.git/
__pycache__/
.envrc
.venv/
15
Dockerfile
Normal file
@@ -0,0 +1,15 @@
FROM python:3.13.1 AS builder

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1
WORKDIR /app


RUN python -m venv .venv
COPY requirements.txt ./
RUN .venv/bin/pip install -r requirements.txt
FROM python:3.13.1-slim
WORKDIR /app
COPY --from=builder /app/.venv .venv/
COPY . .
CMD ["/app/.venv/bin/flask", "run", "--host=0.0.0.0", "--port=8080"]
148
README.md
@@ -1,18 +1,22 @@
# Torn User Activity Scraper
# Torn User Activity Tracker

> [!WARNING]
> **Development is still in its early stages; do not use it in production!**

## Features

- Start and stop scraping user activity data
- View real-time logs
- Download data and log files
- View scraping results and statistics
- View scraping results
- Plugin-based analysis system
- Toggle between light and dark mode

**Note:** Many features are not fully implemented yet, but the activity tracker/grabber works as intended.

## Planned Features

- Additional analyses
- Additional analysis plugins
- Selector for Torn API data to choose which data is tracked
- Improved / fixed log viewer
@@ -24,6 +28,21 @@
- Flask-WTF
- Pandas
- Requests
- Redis
- Celery

Redis currently has to run locally, but this will change in the future. To point the app at a different Redis instance in the meantime, edit `get_redis()` in `tasks.py`:

```python
# tasks.py
def get_redis():
    return redis.StrictRedis(
        host='localhost',
        port=6379,
        db=0,
        decode_responses=True
    )
```
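
One way to avoid editing `tasks.py` for each deployment would be to read the connection settings from environment variables. The following is only a sketch under assumptions, not the project's code: the helper `redis_settings_from_env` and the variable names `REDIS_HOST`, `REDIS_PORT`, and `REDIS_DB` are hypothetical, and the defaults simply mirror the hard-coded values above.

```python
import os


def redis_settings_from_env(environ=None):
    """Build keyword arguments for redis.StrictRedis from environment variables.

    REDIS_HOST, REDIS_PORT and REDIS_DB are hypothetical names; the defaults
    match the values currently hard-coded in tasks.py.
    """
    env = os.environ if environ is None else environ
    return {
        "host": env.get("REDIS_HOST", "localhost"),
        "port": int(env.get("REDIS_PORT", "6379")),
        "db": int(env.get("REDIS_DB", "0")),
        "decode_responses": True,
    }


# In tasks.py this could then be used as:
#     return redis.StrictRedis(**redis_settings_from_env())
```

With no variables set, the helper returns the same settings as the current `get_redis()`, so local setups would keep working unchanged.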

## Installation

@@ -93,6 +112,129 @@ flask run
2. Open your web browser and navigate to `http://127.0.0.1:5000/`.

## Adding an Analysis Module

This guide explains how to add a new analysis module using the provided base classes `BasePlotlyAnalysis`, `BasePlotAnalysis`, and `BaseAnalysis`. The two plot base classes enforce a structured workflow of data preparation, transformation, and visualization.

### 1. Choosing the Right Base Class
Before implementing an analysis module, decide on the appropriate base class:
- **`BasePlotlyAnalysis`**: Use this for interactive plots with **Plotly** that generate **HTML** outputs.
- **`BasePlotAnalysis`**: Use this for static plots with **Matplotlib/Seaborn** that generate **PNG** image files.
- **`BaseAnalysis`**: Use this for any other type of analysis with **text** or **HTML** output, for maximum flexibility.

### 2. Naming Convention
Follow a structured naming convention for consistency:
- **File name:** `plotly_<analysis_name>.py` for Plotly analyses, `plot_<analysis_name>.py` for Matplotlib-based analyses.
- **Class name:** Use PascalCase and a descriptive suffix:
  - Example for Plotly: `PlotlyActivityHeatmap`
  - Example for Matplotlib: `PlotUserSessionDuration`

### 3. Data Structure
The following DataFrame structure is passed to analysis classes:

| user_id | name  | last_action         | status | timestamp                     | prev_timestamp | was_active | hour |
|---------|-------|---------------------|--------|-------------------------------|----------------|------------|------|
| XXXXXXX | UserA | 2025-02-08 17:58:11 | Okay   | 2025-02-08 18:09:41.867984056 | NaT            | False      | 18   |
| XXXXXXX | UserB | 2025-02-08 17:00:10 | Okay   | 2025-02-08 18:09:42.427846909 | NaT            | False      | 18   |
| XXXXXXX | UserC | 2025-02-08 16:31:52 | Okay   | 2025-02-08 18:09:42.823201895 | NaT            | False      | 18   |
| XXXXXXX | UserD | 2025-02-06 23:57:24 | Okay   | 2025-02-08 18:09:43.179914951 | NaT            | False      | 18   |
| XXXXXXX | UserE | 2025-02-06 06:33:40 | Okay   | 2025-02-08 18:09:43.434650898 | NaT            | False      | 18   |

Note that the first sample recorded for each user (that is, the first X rows, where X is the number of tracked members) always has an empty `prev_timestamp` (`NaT`), because no earlier timestamp exists for that user yet.
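
The derived columns can be reproduced on a tiny DataFrame. The sketch below uses the same expressions as `prepare_data()` in `app/analysis/data_utils.py`; the sample users and timestamps are made up for illustration.

```python
import pandas as pd

# Hypothetical samples: two users, two polling rounds each.
df = pd.DataFrame({
    "user_id": [1, 2, 1, 2],
    "name": ["UserA", "UserB", "UserA", "UserB"],
    "last_action": pd.to_datetime([
        "2025-02-08 17:58:11", "2025-02-08 17:00:10",
        "2025-02-08 18:10:30", "2025-02-08 17:00:10",
    ]),
    "timestamp": pd.to_datetime([
        "2025-02-08 18:09:41", "2025-02-08 18:09:42",
        "2025-02-08 18:10:41", "2025-02-08 18:10:42",
    ]),
})

# Same expressions as prepare_data() in app/analysis/data_utils.py.
df["prev_timestamp"] = df.groupby("user_id")["timestamp"].shift(1)
df["was_active"] = (df["timestamp"] - df["last_action"]) <= pd.Timedelta(seconds=60)
df["hour"] = df["timestamp"].dt.hour

first_rows = df["prev_timestamp"].isna().tolist()
active = df["was_active"].tolist()
print(first_rows)
print(active)
```

Only the first sample of each user has `NaT` in `prev_timestamp`, and only UserA's second sample counts as active, because its `last_action` lies within 60 seconds of the sample's `timestamp`.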

### 4. Implementing an Analysis Module
Each analysis module should define two key methods:
- `transform_data(self, df: pd.DataFrame) -> pd.DataFrame`: Processes the input data for plotting.
- `plot_data(self, df: pd.DataFrame)`: Generates the plot. The base class's `execute()` saves it afterwards; Plotly modules must assign the figure to `self.fig`.

#### Example: Adding a Plotly Heatmap
Below is an example of how to create a new analysis module using `BasePlotlyAnalysis`.

```python
import pandas as pd
import plotly.graph_objects as go
from .basePlotlyAnalysis import BasePlotlyAnalysis

class PlotlyActivityHeatmap(BasePlotlyAnalysis):
    """
    Displays user activity trends over multiple days using an interactive heatmap.
    """
    name = "Activity Heatmap (Interactive)"
    description = "Displays user activity trends over multiple days."
    plot_filename = "activity_heatmap.html"

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        df['hour'] = df['timestamp'].dt.hour
        active_counts = df[df['was_active']].pivot_table(
            index='name',
            columns='hour',
            values='was_active',
            aggfunc='sum',
            fill_value=0
        ).reset_index()
        return active_counts.melt(id_vars='name', var_name='hour', value_name='activity_count')

    def plot_data(self, df: pd.DataFrame):
        df = df.pivot(index='name', columns='hour', values='activity_count').fillna(0)
        # The base class writes self.fig to disk, so the figure must be stored on self
        self.fig = go.Figure(data=go.Heatmap(
            z=df.values, x=df.columns, y=df.index, colorscale='Viridis',
            colorbar=dict(title='Activity Count')
        ))
        self.fig.update_layout(title='User Activity Heatmap', xaxis_title='Hour', yaxis_title='User')
```

#### Example: Adding a Static Matplotlib Plot
Below is an example of a Matplotlib-based analysis module using `BasePlotAnalysis`.

```python
import pandas as pd
import matplotlib.pyplot as plt
from .basePlotAnalysis import BasePlotAnalysis

class PlotUserSessionDuration(BasePlotAnalysis):
    """
    Displays a histogram of user session durations.
    """
    name = "User Session Duration Histogram"
    description = "Histogram of session durations."
    plot_filename = "session_duration.png"
    note = ""  # BasePlotAnalysis.execute() uses this as the <img> alt text

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        # last_action precedes the sample timestamp, so subtract it from timestamp
        # to get a non-negative duration in seconds
        df['session_duration'] = (df['timestamp'] - df['last_action']).dt.total_seconds()
        return df

    def plot_data(self, df: pd.DataFrame):
        plt.figure(figsize=(10, 6))
        plt.hist(df['session_duration'].dropna(), bins=30, edgecolor='black')
        plt.xlabel('Session Duration (seconds)')
        plt.ylabel('Frequency')
        plt.title('User Session Duration Histogram')
```

### 5. Registering the Module
Once you have created your analysis module, it is discovered automatically by `load_analysis_modules()`, provided it is placed in the `app/analysis/` package.
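
The filter that decides which classes get registered can be sketched in isolation. The condition below mirrors the one used by `load_analysis_modules()` in `app/analysis/__init__.py`; the stand-in classes and the helper `concrete_analyses` are hypothetical and exist only so the snippet runs on its own.

```python
import inspect
from abc import ABC, abstractmethod


class BaseAnalysis(ABC):
    """Stand-in for app.analysis.base.BaseAnalysis."""

    @abstractmethod
    def execute(self, df):
        ...


class BasePlotAnalysis(BaseAnalysis, ABC):
    """Still abstract: transform_data and plot_data are left to subclasses."""

    def execute(self, df):
        return self.plot_data(self.transform_data(df))

    @abstractmethod
    def transform_data(self, df):
        ...

    @abstractmethod
    def plot_data(self, df):
        ...


class PlotDemo(BasePlotAnalysis):
    """Hypothetical concrete analysis."""

    def transform_data(self, df):
        return df

    def plot_data(self, df):
        return "plotted"


def concrete_analyses(candidates):
    """Instantiate only the concrete BaseAnalysis subclasses among candidates."""
    return [
        cls()
        for cls in candidates
        if inspect.isclass(cls)
        and issubclass(cls, BaseAnalysis)
        and cls is not BaseAnalysis
        and not inspect.isabstract(cls)
    ]


found = concrete_analyses([BaseAnalysis, BasePlotAnalysis, PlotDemo, dict])
print([type(a).__name__ for a in found])
```

Because `inspect.isabstract` rejects any class that still has unimplemented abstract methods, a module is only registered once it implements both `transform_data()` and `plot_data()`.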

### 6. Running the Analysis
To execute the analysis, pass a Pandas DataFrame to its `execute` method:
```python
from app.analysis.plotly_activity_heatmap import PlotlyActivityHeatmap
analysis = PlotlyActivityHeatmap()
result_html = analysis.execute(df)
print(result_html)  # HTML <iframe> snippet that embeds the plot
```
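
`execute()` resolves the plot location through `mk_plotdir()` in `app/analysis/data_utils.py`, which calls `current_app.root_path` and `url_for(..., _external=True)`, so it needs an active Flask context when run outside a request (for example from a script or a test). A minimal sketch, assuming a throwaway `Flask` app in place of the one returned by `create_app()`:

```python
from flask import Flask, url_for

# Throwaway app for illustration; in the project this would be the app
# returned by create_app().
app = Flask(__name__)

# test_request_context() provides the application and request context that
# url_for(..., _external=True) needs when called outside a real request.
with app.test_request_context():
    plot_url = url_for("static", filename="plots/activity_heatmap.html", _external=True)

print(plot_url)
```

For a real analysis, `analysis.execute(df)` would be called inside the same `with` block.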

### Summary
- Choose the appropriate base class (`BasePlotlyAnalysis`, `BasePlotAnalysis`, or `BaseAnalysis`).
- Follow the naming convention (`plotly_<name>.py` for Plotly, `plot_<name>.py` for Matplotlib).
- Implement the `transform_data()` and `plot_data()` methods.
- The module will be auto-registered if placed in the `app/analysis/` package.
- Execute the analysis by calling `.execute(df)`.

This structure ensures that new analyses can be easily integrated and maintained.

## License

All assets and code are under the [CC BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/) LICENSE and in the public domain unless specified otherwise.
@@ -0,0 +1,56 @@
import os
from flask import Flask
from flask_bootstrap import Bootstrap5
from datetime import datetime

from app.views import register_views
from app.api import register_api
from app.config import load_config
from app.filters import register_filters
from app.tasks import celery

from app.logging_config import init_logger

def create_app(config=None):
    app = Flask(__name__)

    if config is None:
        config = load_config()
    app.config.update(config)

    os.environ['TZ'] = 'UTC'

    app.config['SECRET_KEY'] = config['DEFAULT']['SECRET_KEY']

    # Move bootstrap settings to root level
    for key, value in config.get('BOOTSTRAP', {}).items():
        app.config[key.upper()] = value

    # Initialize Celery
    celery.conf.update(app.config)

    bootstrap = Bootstrap5(app)

    # Store the entire config in Flask app
    app.config.update(config)

    # Initialize other settings
    app.config['SCRAPING_ACTIVE'] = False
    app.config['SCRAPING_THREAD'] = None
    app.config['DATA_FILE_NAME'] = None
    app.config['LOG_FILE_NAME'] = "log/" + datetime.now().strftime('%Y-%m-%d-%H-%M') + '.log'

    # Initialize logging
    app.logger = init_logger(app.config)

    # Register routes
    register_views(app)
    register_api(app)
    register_filters(app)

    @app.context_processor
    def inject_main_config():
        main_config = app.config.get('MAIN', {})
        return dict(main_config=main_config)

    return app
@@ -1,60 +0,0 @@
import os
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # Prevents GUI-related issues in Flask

import matplotlib.pyplot as plt
import seaborn as sns


def load_data(file_path: str) -> pd.DataFrame:
    """Loads the scraped data from a CSV file into a Pandas DataFrame."""
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"File {file_path} not found.")

    df = pd.read_csv(file_path)
    df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
    df["last_action"] = pd.to_datetime(df["last_action"], errors="coerce")

    return df

def generate_statistics(df: pd.DataFrame):
    """Generates activity statistics grouped by hour."""
    df["hour"] = df["timestamp"].dt.hour
    return df.groupby("hour").size()

def plot_activity_distribution(df: pd.DataFrame, output_path="activity_distribution.png"):
    """Plots user activity distribution and saves the figure."""

    # Ensure the directory exists
    static_dir = os.path.join("app", "static", "plots")
    output_path = os.path.join(static_dir, output_path)
    os.makedirs(static_dir, exist_ok=True)

    # Convert timestamp column to datetime (if not already)
    if not pd.api.types.is_datetime64_any_dtype(df["timestamp"]):
        df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")

    df["hour"] = df["timestamp"].dt.hour
    activity_counts = df.groupby("hour").size().reset_index(name="count")

    # Use non-GUI backend for Matplotlib
    plt.figure(figsize=(10, 5))

    # Fix Seaborn Warning: Assign `hue` explicitly
    sns.barplot(x="hour", y="count", data=activity_counts, hue="hour", palette="Blues", legend=False)

    plt.xlabel("Hour of the Day")
    plt.ylabel("Activity Count")
    plt.title("User Activity Distribution")
    plt.xticks(range(0, 24))

    # Save the plot file safely
    plt.savefig(output_path, bbox_inches="tight")
    plt.close()

    # Verify the file exists after saving
    if not os.path.exists(output_path):
        raise FileNotFoundError(f"Plot could not be saved to {output_path}")

    return output_path
34
app/analysis/__init__.py
Normal file
@@ -0,0 +1,34 @@
import os
import pkgutil
import importlib
import inspect
from abc import ABC

from .base import BaseAnalysis

import pandas as pd

def load_analysis_modules():
    analysis_modules = []
    package_path = __path__[0]

    for _, module_name, _ in pkgutil.iter_modules([package_path]):
        module = importlib.import_module(f"app.analysis.{module_name}")

        for _, obj in inspect.getmembers(module, inspect.isclass):
            # Exclude abstract classes (like BasePlotAnalysis)
            if issubclass(obj, BaseAnalysis) and obj is not BaseAnalysis and not inspect.isabstract(obj):
                analysis_modules.append(obj())  # Instantiate only concrete classes

    return analysis_modules

def load_data(file_path: str) -> pd.DataFrame:
    """Loads the scraped data from a CSV file into a Pandas DataFrame."""
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"File {file_path} not found.")

    df = pd.read_csv(file_path)
    df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
    df["last_action"] = pd.to_datetime(df["last_action"], errors="coerce")

    return df
11
app/analysis/base.py
Normal file
@@ -0,0 +1,11 @@
from abc import ABC, abstractmethod
import pandas as pd

class BaseAnalysis(ABC):
    name = "Base Analysis"
    description = "This is a base analysis module."

    @abstractmethod
    def execute(self, df: pd.DataFrame):
        """Run analysis on the given DataFrame"""
        pass
77
app/analysis/basePlotAnalysis.py
Normal file
@@ -0,0 +1,77 @@
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from abc import ABC, abstractmethod

from .base import BaseAnalysis
from app.analysis.data_utils import prepare_data, mk_plotdir

import matplotlib
matplotlib.use('Agg')

# -------------------------------------------
# Base Class for All Plot Analyses
# -------------------------------------------
class BasePlotAnalysis(BaseAnalysis, ABC):
    """
    Base class for all plot-based analyses.
    It enforces a structure for:
    - Data preparation
    - Transformation
    - Plot generation
    - Memory cleanup

    Attributes:
        plot_filename (str): The filename for the output plot.
        alt_text (str): The alt text for the plot.
    """
    plot_filename = "default_plot.png"
    alt_text = "Default Alt Text"

    def execute(self, df: pd.DataFrame):
        """
        Executes the full analysis pipeline.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            str: HTML img tag containing the URL to the generated plot.
        """
        df = prepare_data(df)  # Step 1: Prepare data

        paths = mk_plotdir(self.plot_filename)
        self.output_path, self.plot_url = paths['output_path'], paths['plot_url']

        df = self.transform_data(df)  # Step 2: Transform data (implemented by subclass)
        self.plot_data(df)  # Step 3: Create the plot

        plt.savefig(self.output_path, bbox_inches="tight")
        plt.close()

        del df  # Step 4: Free memory
        # Subclasses may define `note`; fall back to alt_text so a missing
        # attribute does not raise AttributeError
        alt = getattr(self, "note", self.alt_text)
        return f'<img src="{self.plot_url}" alt="{alt}">'

    @abstractmethod
    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Subclasses must define how they transform the data.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame.
        """
        pass

    @abstractmethod
    def plot_data(self, df: pd.DataFrame):
        """
        Subclasses must define how they generate the plot.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing data to be plotted.
        """
        pass
73
app/analysis/basePlotlyAnalysis.py
Normal file
@@ -0,0 +1,73 @@
import os
import pandas as pd
import plotly.graph_objects as go
from abc import ABC, abstractmethod

from .base import BaseAnalysis
from app.analysis.data_utils import prepare_data, mk_plotdir

# -------------------------------------------
# Base Class for All Plotly Plot Analyses
# -------------------------------------------
class BasePlotlyAnalysis(BaseAnalysis, ABC):
    """
    Base class for all Plotly plot-based analyses.
    It enforces a structure for:
    - Data preparation
    - Transformation
    - Plot generation
    - Memory cleanup

    Attributes:
        plot_filename (str): The filename for the output plot.
        alt_text (str): The alt text for the plot.
    """
    plot_filename = "default_plot.html"
    alt_text = "Default Alt Text"

    def execute(self, df: pd.DataFrame):
        """
        Executes the full analysis pipeline.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            str: HTML iframe containing the URL to the generated plot.
        """
        df = prepare_data(df)  # Step 1: Prepare data

        paths = mk_plotdir(self.plot_filename)
        self.output_path, self.plot_url = paths['output_path'], paths['plot_url']

        df = self.transform_data(df)  # Step 2: Transform data (implemented by subclass)
        self.plot_data(df)  # Step 3: Create the plot

        # Save the plot as an HTML file
        self.fig.write_html(self.output_path)

        del df  # Step 4: Free memory
        return f'<iframe src="{self.plot_url}" width="100%" height="600"></iframe>'

    @abstractmethod
    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Subclasses must define how they transform the data.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame.
        """
        pass

    @abstractmethod
    def plot_data(self, df: pd.DataFrame):
        """
        Subclasses must define how they generate the plot.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing data to be plotted.
        """
        pass
45
app/analysis/data_utils.py
Normal file
@@ -0,0 +1,45 @@
from flask import current_app, url_for
import os
import pandas as pd

def prepare_data(df):
    """
    Prepares the data for analysis by converting timestamps, calculating previous timestamps,
    determining active status, and extracting the hour from the timestamp.

    Parameters:
        df (pd.DataFrame): The input DataFrame containing user activity data.

    Returns:
        pd.DataFrame: The processed DataFrame with additional columns for analysis.

    The returned DataFrame will have the following columns:
        user_id   name      last_action          status  timestamp                      prev_timestamp  was_active  hour
    0   12345678  UserName  2025-02-08 17:58:11  Okay    2025-02-08 18:09:41.867984056  NaT             False       18
    """
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df["last_action"] = pd.to_datetime(df["last_action"])
    df["prev_timestamp"] = df.groupby("user_id")["timestamp"].shift(1)
    df["was_active"] = (df["timestamp"] - df["last_action"]) <= pd.Timedelta(seconds=60)
    df["was_active"] = df["was_active"].fillna(False)
    df['hour'] = df['timestamp'].dt.hour
    return df

def mk_plotdir(output_filename):
    """
    Creates the directory for storing plots and generates the output path and URL for the plot.

    Parameters:
        output_filename (str): The filename for the output plot.

    Returns:
        dict: A dictionary containing the output path and plot URL.
    """
    plots_dir = os.path.join(current_app.root_path, "static", "plots")
    os.makedirs(plots_dir, exist_ok=True)

    output_path = os.path.join(plots_dir, output_filename)

    plot_url = url_for('static', filename=f'plots/{output_filename}', _external=True)

    return {'output_path': output_path, 'plot_url': plot_url}
51
app/analysis/plot_bar_activity-user.py
Normal file
@@ -0,0 +1,51 @@
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from .basePlotAnalysis import BasePlotAnalysis
from flask import current_app, url_for

import matplotlib
matplotlib.use('Agg')

class PlotTopActiveUsers(BasePlotAnalysis):
    """
    Class for analyzing the most active users and generating a bar chart.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Top Active Users"
    description = "Displays the most active users based on their number of recorded actions."
    plot_filename = "bar_activity-per-user.png"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Transform data for the bar plot.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame with active counts per user.
        """
        df = df[df['was_active']].groupby('name').size().reset_index(name='active_count')
        return df

    def plot_data(self, df: pd.DataFrame):
        """
        Generate bar plot.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing active counts per user.
        """
        # create a barplot from active counts sorted by active count
        plt.figure(figsize=(10, 6))
        sns.barplot(x='active_count', y='name', data=df.sort_values('active_count', ascending=False))
        plt.xticks(rotation=90)
        plt.title('Minutes Active')
        # Counts are on the x-axis and player names on the y-axis
        plt.xlabel('Active Count')
        plt.ylabel('Player')
53
app/analysis/plot_bar_peak_hours.py
Normal file
@@ -0,0 +1,53 @@
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from .basePlotAnalysis import BasePlotAnalysis

import matplotlib
matplotlib.use('Agg')

class PlotPeakHours(BasePlotAnalysis):
    """
    Class for analyzing peak activity hours and generating a bar chart.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """

    name = "Peak Hours Analysis"
    description = "Identifies peak activity hours using a bar chart."
    plot_filename = "peak_hours.png"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        No extra transformation is needed: prepare_data has already added the
        was_active and hour columns. See data_utils.py.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The unchanged DataFrame.
        """
        return df

    def plot_data(self, df: pd.DataFrame):
        """
        Generate bar chart for peak hours.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing user activity data.
        """
        peak_hours = df[df["was_active"]]["hour"].value_counts().sort_index()

        plt.figure(figsize=(12, 5))
        sns.barplot(x=peak_hours.index, y=peak_hours.values, hue=peak_hours.values, palette="coolwarm")

        plt.xlabel("Hour of the Day")
        plt.ylabel("Activity Count")
        plt.title("Peak Hours of User Activity")
        plt.xticks(range(0, 24))
55
app/analysis/plot_heat_user-activity-hour.py
Normal file
@@ -0,0 +1,55 @@
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from .basePlotAnalysis import BasePlotAnalysis

import matplotlib
matplotlib.use('Agg')

class PlotActivityHeatmap(BasePlotAnalysis):
    """
    Class for analyzing user activity trends over multiple days and generating a heatmap.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Activity Heatmap"
    description = "Displays user activity trends over multiple days using a heatmap. Generates a downloadable PNG image."
    plot_filename = "activity_heatmap.png"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Transform data for the heatmap.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame with activity counts by hour.
        """
        active_counts = df[df['was_active']].pivot_table(
            index='name',
            columns='hour',
            values='was_active',
            aggfunc='sum',
            fill_value=0
        )
        active_counts['total_active_minutes'] = active_counts.sum(axis=1)
        return active_counts.sort_values(by='total_active_minutes', ascending=False)

    def plot_data(self, df: pd.DataFrame):
        """
        Generate heatmap plot.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing activity counts by hour.
        """
        plt.figure(figsize=(12, 8))
        sns.heatmap(df.loc[:, df.columns != 'total_active_minutes'], cmap='viridis', cbar_kws={'label': 'Count of was_active == True'})
        plt.xlabel('Hour of Day')
        plt.ylabel('User')  # rows are indexed by user name, not user ID
        plt.title('User Activity Heatmap')
67
app/analysis/plot_line_activity-user.py
Normal file
@@ -0,0 +1,67 @@
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from .basePlotAnalysis import BasePlotAnalysis
from flask import current_app, url_for

import matplotlib
matplotlib.use('Agg')

class PlotLineActivityAllUsers(BasePlotAnalysis):
    """
    Class for analyzing user activity trends over multiple days and generating a line graph.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Activity Line Graph (All Users)"
    description = "This analysis shows the activity line graph for all users. Generates a downloadable PNG image."
    plot_filename = "line_activity-all_users.png"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Transform data for the line plot.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame with activity counts by hour.
        """
        df['hour'] = df['timestamp'].dt.hour
        df = df[df['was_active']].pivot_table(index='name', columns='hour', values='was_active', aggfunc='sum', fill_value=0)
        df['total_active_minutes'] = df.sum(axis=1)
        df = df.sort_values(by='total_active_minutes', ascending=False).drop('total_active_minutes', axis=1)

        # Last row of the running total over users, i.e. the per-hour sum across all users
        cumulative_sum_row = df.cumsum().iloc[-1]
        df.loc['Cumulative Sum'] = cumulative_sum_row

        return df

    def plot_data(self, df: pd.DataFrame):
        """
        Generate line graph for user activity throughout the day.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing activity counts by hour.
        """
        plt.figure(figsize=(12, 6))

        # Plot each user's activity
        for index, row in df.iterrows():
            if index == 'Cumulative Sum':
                plt.plot(row.index, row.values, label=index, linewidth=3, color='black')  # Bold line for cumulative sum
            else:
                plt.plot(row.index, row.values, label=index)

        # Add labels and title
        plt.xlabel('Hour of Day')
        plt.ylabel('Activity Count')
        plt.title('User Activity Throughout the Day')
        plt.legend(loc='upper left', bbox_to_anchor=(1, 1))

        plt.grid(True)
82
app/analysis/plotly_heat_user-activity.py
Normal file
@@ -0,0 +1,82 @@
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

from .basePlotlyAnalysis import BasePlotlyAnalysis
from flask import current_app, url_for

class PlotlyActivityHeatmap(BasePlotlyAnalysis):
    """
    Class for analyzing user activity trends over multiple days and generating an interactive heatmap.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Activity Heatmap (Interactive)"
    description = "Displays user activity trends over multiple days using an interactive heatmap."
    plot_filename = "activity_heatmap.html"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Transform data for the heatmap.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame with activity counts by hour.
        """
        df['hour'] = df['timestamp'].dt.hour
        active_counts = df[df['was_active']].pivot_table(
            index='name',
            columns='hour',
            values='was_active',
            aggfunc='sum',
            fill_value=0
        ).reset_index()

        # Ensure all hours are represented
        all_hours = pd.DataFrame({'hour': range(24)})
        active_counts = active_counts.melt(id_vars='name', var_name='hour', value_name='activity_count')
        active_counts = active_counts.merge(all_hours, on='hour', how='right').fillna(0)
        active_counts['hour'] = active_counts['hour'].astype(int)  # Ensure hour is treated as numeric
        return active_counts

    def plot_data(self, df: pd.DataFrame):
        """
        Generate heatmap plot.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing activity counts by hour.
        """
        df = df.pivot(index='name', columns='hour', values='activity_count').fillna(0)

        # Create a Plotly heatmap
        self.fig = go.Figure(data=go.Heatmap(
            z=df.values,
            x=df.columns,
            y=df.index,
            colorscale='Viridis',
            colorbar=dict(title='Count of was_active == True')
        ))

        # Update layout
        self.fig.update_layout(
            title='User Activity Heatmap',
            xaxis_title='Hour of Day',
            yaxis_title='User ID',
            xaxis=dict(tickmode='linear', dtick=1, range=[0, 23]),  # Ensure x-axis covers all hours
            template='plotly_white'
        )

        self.fig.update_traces(
            hovertemplate="<br>".join([
                "Hour: %{x}",
                "Name: %{y}",
                "Activity: %{z}",
            ])
        )
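The `transform_data` step in this file counts active samples per user and hour, then pads the result so that all 24 hours appear. The idea can be sketched in plain Python without pandas. This is only an illustration: the function name `hourly_activity` and the `(name, hour, was_active)` tuple layout are assumptions, not part of the diff.

```python
from collections import defaultdict


def hourly_activity(samples):
    """Count active samples per user for every hour 0..23.

    `samples` is an iterable of (name, hour, was_active) tuples; this layout
    is a hypothetical stand-in for the DataFrame columns used in the diff.
    """
    counts = defaultdict(lambda: [0] * 24)  # one 24-slot row per user
    for name, hour, was_active in samples:
        if was_active:  # inactive samples are ignored, like df[df['was_active']]
            counts[name][hour] += 1
    return dict(counts)


if __name__ == "__main__":
    demo = [("alice", 9, True), ("alice", 9, True), ("bob", 23, True), ("bob", 3, False)]
    for user, row in hourly_activity(demo).items():
        print(user, row)
```

Because every user row is created with 24 slots up front, hours without any activity stay at zero instead of going missing, which is the property the heatmap needs.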
65
app/analysis/plotly_line_activity-user.py
Normal file
@@ -0,0 +1,65 @@
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from .basePlotlyAnalysis import BasePlotlyAnalysis
from flask import current_app, url_for

class PlotlyLineActivityAllUsers(BasePlotlyAnalysis):
    """
    Class for analyzing user activity trends over multiple days and generating an interactive line graph.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Activity Line Graph (All Users, Interactive)"
    description = "This analysis shows the activity line graph for all users. The graph is interactive and can be used to explore the data."
    plot_filename = "line_activity-all_users.html"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Transform data for the line plot.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame with activity counts by hour.
        """
        df['hour'] = df['timestamp'].dt.hour
        df = df[df['was_active'] == True].pivot_table(index='name', columns='hour', values='was_active', aggfunc='sum', fill_value=0)
        df['total_active_minutes'] = df.sum(axis=1)
        df = df.sort_values(by='total_active_minutes', ascending=False).drop('total_active_minutes', axis=1)

        cumulative_sum_row = df.cumsum().iloc[-1]
        df.loc['Cumulative Sum'] = cumulative_sum_row

        return df

    def plot_data(self, df: pd.DataFrame):
        """
        Generate interactive line graph for user activity throughout the day.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing activity counts by hour.
        """
        self.fig = make_subplots()

        # Plot each user's activity
        for index, row in df.iterrows():
            if index == 'Cumulative Sum':
                self.fig.add_trace(go.Scatter(x=row.index, y=row.values, mode='lines', name=index, line=dict(width=3, color='black')))  # Bold line for cumulative sum
            else:
                self.fig.add_trace(go.Scatter(x=row.index, y=row.values, mode='lines', name=index))

        self.fig.update_layout(
            title='User Activity Throughout the Day',
            xaxis_title='Hour of Day',
            yaxis_title='Activity Count',
            legend_title='User',
            legend=dict(x=1, y=1),
            template='plotly_white'
        )
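The "Cumulative Sum" row in `transform_data` is the last row of a running sum taken down the user rows, which is the per-hour total over all users. A rough plain-Python sketch of the same computation follows; the helper name `add_total_row` and the dict-of-lists layout are assumptions for illustration only, not the pandas code from the diff.

```python
from itertools import accumulate


def add_total_row(table):
    """Return a copy of `table` with a 'Cumulative Sum' row appended.

    `table` maps a user name to a list of per-hour counts (hypothetical layout).
    """
    rows = list(table.values())
    if not rows:
        return dict(table)
    # Running sum down the rows, column by column; the last entry is the total.
    running = list(accumulate(rows, lambda acc, row: [a + b for a, b in zip(acc, row)]))
    result = dict(table)
    result["Cumulative Sum"] = list(running[-1])  # copy to avoid aliasing a user row
    return result


if __name__ == "__main__":
    print(add_total_row({"alice": [1, 2, 0], "bob": [3, 4, 5]}))
```

Taking only the final running-sum row gives the same numbers as a plain column sum; the intermediate rows are never used.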
31
app/analysis/table_statistics.py
Normal file
@@ -0,0 +1,31 @@
import pandas as pd
from .base import BaseAnalysis
from flask import render_template_string

class GenerateStatistics(BaseAnalysis):
    name = "Test Statistics (Placeholder)"
    description = "Generates activity statistics grouped by hour."

    def execute(self, df: pd.DataFrame):
        df["hour"] = df["timestamp"].dt.hour
        statistics = df.groupby("hour").size().reset_index(name="count")

        # Convert statistics DataFrame to HTML
        table_html = statistics.to_html(classes="table table-bordered table-striped")

        # Wrap it in Bootstrap styling
        html_content = render_template_string(
            """
            <div class="card mt-3">
                <div class="card-header">
                    <h4>Activity Statistics</h4>
                </div>
                <div class="card-body">
                    {{ table_html | safe }}
                </div>
            </div>
            """,
            table_html=table_html
        )

        return html_content
116
app/api.py
@@ -1,4 +1,3 @@
|
||||
# filepath: /home/michaelb/Dokumente/TornActivityTracker/app/api.py
|
||||
from flask import jsonify, request, Response, send_from_directory, current_app
|
||||
import threading
|
||||
import os
|
||||
@@ -6,15 +5,11 @@ import glob
|
||||
from datetime import datetime
|
||||
import pandas as pd
|
||||
|
||||
from app.models import Scraper, generate_statistics
|
||||
from app.util import create_zip, delete_old_zips, tail, get_size
|
||||
from app.models import Scraper
|
||||
from app.util import create_zip, delete_old_zips, tail
|
||||
from app.config import load_config
|
||||
from app.logging_config import get_logger
|
||||
from app.forms import ScrapingForm
|
||||
|
||||
config = load_config()
|
||||
logger = get_logger()
|
||||
log_file_name = logger.handlers[0].baseFilename
|
||||
from app.tasks import start_scraping_task, stop_scraping_task, get_redis
|
||||
|
||||
scraping_thread = None
|
||||
scraper = None
|
||||
@@ -23,52 +18,53 @@ scrape_lock = threading.Lock()
|
||||
def register_api(app):
|
||||
@app.route('/start_scraping', methods=['POST'])
|
||||
def start_scraping():
|
||||
with scrape_lock:
|
||||
scraper = current_app.config.get('SCRAPER')
|
||||
if scraper is not None and scraper.scraping_active:
|
||||
logger.warning("Can't start scraping process: scraping already in progress")
|
||||
return jsonify({"status": "Scraping already in progress"})
|
||||
|
||||
form = ScrapingForm()
|
||||
if form.validate_on_submit():
|
||||
redis_client = get_redis()
|
||||
faction_id = form.faction_id.data
|
||||
fetch_interval = form.fetch_interval.data
|
||||
run_interval = form.run_interval.data
|
||||
|
||||
scraper = Scraper(faction_id, fetch_interval, run_interval, current_app)
|
||||
scraper.scraping_active = True
|
||||
# Check if scraping is already active
|
||||
if redis_client.hget(f"scraper:{faction_id}", "scraping_active") == "1":
|
||||
return jsonify({"status": "Scraping already in progress"})
|
||||
|
||||
scraping_thread = threading.Thread(target=scraper.start_scraping)
|
||||
scraping_thread.daemon = True
|
||||
scraping_thread.start()
|
||||
|
||||
current_app.config['SCRAPER'] = scraper
|
||||
current_app.config['SCRAPING_THREAD'] = scraping_thread
|
||||
# Convert config to a serializable dict with only needed values
|
||||
config_dict = {
|
||||
'DATA': {'DATA_DIR': current_app.config['DATA']['DATA_DIR']},
|
||||
'DEFAULT': {'API_KEY': current_app.config['DEFAULT']['API_KEY']}
|
||||
}
|
||||
|
||||
start_scraping_task.delay(
|
||||
faction_id,
|
||||
int(form.fetch_interval.data), # Ensure this is an int
|
||||
int(form.run_interval.data), # Ensure this is an int
|
||||
config_dict
|
||||
)
|
||||
return jsonify({"status": "Scraping started"})
|
||||
return jsonify({"status": "Invalid form data"})
|
||||
|
||||
@app.route('/stop_scraping', methods=['POST'])
|
||||
def stop_scraping():
|
||||
scraper = current_app.config.get('SCRAPER')
|
||||
if scraper is None or not scraper.scraping_active:
|
||||
return jsonify({"status": "Scraping is not running"})
|
||||
redis_client = get_redis()
|
||||
faction_id = redis_client.get("current_faction_id")
|
||||
if not faction_id:
|
||||
return jsonify({"status": "No active scraping session"})
|
||||
|
||||
stop_scraping_task.delay(faction_id)
|
||||
return jsonify({"status": "Stopping scraping"})
|
||||
|
||||
scraper.stop_scraping()
|
||||
current_app.config['SCRAPING_ACTIVE'] = False
|
||||
logger.debug("Scraping stopped by user")
|
||||
return jsonify({"status": "Scraping stopped"})
|
||||
@app.route('/logfile', methods=['GET'])
|
||||
def logfile():
|
||||
log_file_name = current_app.logger.handlers[0].baseFilename
|
||||
|
||||
page = int(request.args.get('page', 0)) # Page number
|
||||
lines_per_page = int(request.args.get('lines_per_page', config['LOGGING']['VIEW_PAGE_LINES'])) # Lines per page
|
||||
lines_per_page = int(request.args.get('lines_per_page', current_app.config['LOGGING']['VIEW_PAGE_LINES'])) # Lines per page
|
||||
log_file_path = log_file_name # Path to the current log file
|
||||
|
||||
if not os.path.isfile(log_file_path):
|
||||
logger.error("Log file not found")
|
||||
current_app.logger.error("Log file not found")
|
||||
return jsonify({"error": "Log file not found"}), 404
|
||||
|
||||
log_lines = list(tail(log_file_path, config['LOGGING']['VIEW_MAX_LINES']))
|
||||
log_lines = list(tail(log_file_path, current_app.config['LOGGING']['VIEW_MAX_LINES']))
|
||||
|
||||
log_lines = log_lines[::-1] # Reverse the list
|
||||
|
||||
@@ -123,14 +119,15 @@ def register_api(app):
|
||||
|
||||
@app.route('/delete_files', methods=['POST'])
|
||||
def delete_files():
|
||||
log_file_name = current_app.logger.handlers[0].baseFilename
|
||||
file_paths = request.json.get('file_paths', [])
|
||||
|
||||
if not file_paths:
|
||||
return jsonify({"error": "No files specified"}), 400
|
||||
|
||||
errors = []
|
||||
data_dir = os.path.abspath(config['DATA']['DATA_DIR'])
|
||||
log_dir = os.path.abspath(config['LOGGING']['LOG_DIR'])
|
||||
data_dir = os.path.abspath(current_app.config['DATA']['DATA_DIR'])
|
||||
log_dir = os.path.abspath(current_app.config['LOGGING']['LOG_DIR'])
|
||||
|
||||
for file_path in file_paths:
|
||||
if file_path.startswith('/data/'):
|
||||
@@ -171,40 +168,63 @@ def register_api(app):
|
||||
|
||||
@app.route('/data/<path:filename>')
|
||||
def download_data_file(filename):
|
||||
data_dir = os.path.abspath(config['DATA']['DATA_DIR'])
|
||||
data_dir = os.path.abspath(current_app.config['DATA']['DATA_DIR'])
|
||||
file_path = os.path.join(data_dir, filename)
|
||||
|
||||
return send_from_directory(directory=data_dir, path=filename, as_attachment=True)
|
||||
|
||||
@app.route('/log/<path:filename>')
|
||||
def download_log_file(filename):
|
||||
log_dir = os.path.abspath(config['LOGGING']['LOG_DIR'])
|
||||
log_dir = os.path.abspath(current_app.config['LOGGING']['LOG_DIR'])
|
||||
file_path = os.path.join(log_dir, filename)
|
||||
|
||||
return send_from_directory(directory=log_dir, path=filename, as_attachment=True)
|
||||
|
||||
@app.route('/tmp/<path:filename>')
|
||||
def download_tmp_file(filename):
|
||||
tmp_dir = os.path.abspath(config['TEMP']['TEMP_DIR'])
|
||||
tmp_dir = os.path.abspath(current_app.config['TEMP']['TEMP_DIR'])
|
||||
file_path = os.path.join(tmp_dir, filename)
|
||||
|
||||
return send_from_directory(directory=tmp_dir, path=filename, as_attachment=True)
|
||||
|
||||
|
||||
@app.route('/config/lines_per_page')
|
||||
def get_lines_per_page():
|
||||
lines_per_page = config['LOGGING']['VIEW_PAGE_LINES']
|
||||
lines_per_page = current_app.config['LOGGING']['VIEW_PAGE_LINES']
|
||||
return jsonify({"lines_per_page": lines_per_page})
|
||||
|
||||
@app.route('/scraping_status', methods=['GET'])
|
||||
def scraping_status():
|
||||
if scraper is None:
|
||||
logger.debug("Scraper is not initialized.")
|
||||
redis_client = get_redis()
|
||||
current_faction_id = redis_client.get("current_faction_id")
|
||||
|
||||
if not current_faction_id:
|
||||
return jsonify({"scraping_active": False})
|
||||
|
||||
if scraper.scraping_active:
|
||||
logger.debug("Scraping is active.")
|
||||
return jsonify({"scraping_active": True})
|
||||
else:
|
||||
logger.debug("Scraping is not active.")
|
||||
scraping_active = redis_client.hget(f"scraper:{current_faction_id}", "scraping_active")
|
||||
|
||||
# If we have a faction_id but scraping is not active, clean up the stale state
|
||||
if not scraping_active or scraping_active == "0":
|
||||
redis_client.delete("current_faction_id")
|
||||
return jsonify({"scraping_active": False})
|
||||
|
||||
return jsonify({
|
||||
"scraping_active": True,
|
||||
"faction_id": current_faction_id
|
||||
})
|
||||
|
||||
@app.route('/scraping_get_end_time')
|
||||
def scraping_get_end_time():
|
||||
redis_client = get_redis()
|
||||
current_faction_id = redis_client.get("current_faction_id")
|
||||
|
||||
if not current_faction_id:
|
||||
return jsonify({"scraping_active": False})
|
||||
|
||||
end_time = redis_client.hget(f"scraper:{current_faction_id}", "end_time")
|
||||
if not end_time:
|
||||
return jsonify({"scraping_active": False})
|
||||
|
||||
return jsonify({
|
||||
"end_time": end_time,
|
||||
"faction_id": current_faction_id
|
||||
})
|
||||
|
||||
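The new `/scraping_status` handler reads `current_faction_id` from Redis, checks the `scraping_active` field of the matching `scraper:<id>` hash, and deletes the faction key when the flag is missing or `"0"`. The sketch below replays that logic against a small in-memory stand-in so it can be tried without a Redis server. `FakeRedis` and the free function `scraping_status` are hypothetical helpers written for this note; they only imitate the three client calls used here and assume string responses (as with `decode_responses=True`).

```python
class FakeRedis:
    """Tiny in-memory stand-in for the few Redis calls used here (illustrative only)."""

    def __init__(self):
        self.strings = {}
        self.hashes = {}

    def get(self, key):
        return self.strings.get(key)

    def hget(self, key, field):
        return self.hashes.get(key, {}).get(field)

    def delete(self, key):
        self.strings.pop(key, None)
        self.hashes.pop(key, None)


def scraping_status(redis_client):
    """Report scraping as active unless the flag is missing or "0"."""
    faction_id = redis_client.get("current_faction_id")
    if not faction_id:
        return {"scraping_active": False}

    active = redis_client.hget(f"scraper:{faction_id}", "scraping_active")
    if not active or active == "0":
        # A faction id without an active flag is stale; drop it.
        redis_client.delete("current_faction_id")
        return {"scraping_active": False}

    return {"scraping_active": True, "faction_id": faction_id}


if __name__ == "__main__":
    client = FakeRedis()
    client.strings["current_faction_id"] = "42"
    client.hashes["scraper:42"] = {"scraping_active": "1"}
    print(scraping_status(client))
```

Cleaning up the stale key inside the read path means a crashed worker cannot leave the UI stuck in a "running" state, at the cost of a write on what looks like a read-only endpoint.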
42
app/app.py
@@ -1,42 +0,0 @@
from flask import Flask
from flask_bootstrap import Bootstrap5
from datetime import datetime

from app.views import register_views
from app.api import register_api
from app.config import load_config
from app.filters import register_filters
from app.analysis import generate_statistics

def init_app():
    config = load_config()

    # Initialize app
    app = Flask(__name__)

    # Load configuration
    app.config['SECRET_KEY'] = config['DEFAULT']['SECRET_KEY']
    app.config['API_KEY'] = config['DEFAULT']['API_KEY']

    app.config['DATA'] = config['DATA']
    app.config['TEMP'] = config['TEMP']
    app.config['LOGGING'] = config['LOGGING']

    # Move bootstrap settings to root level
    for key in config['BOOTSTRAP']:
        app.config[key.upper()] = config['BOOTSTRAP'][key]

    bootstrap = Bootstrap5(app)

    # Initialize global variables
    app.config['SCRAPING_ACTIVE'] = False
    app.config['SCRAPING_THREAD'] = None
    app.config['DATA_FILE_NAME'] = None
    app.config['LOG_FILE_NAME'] = "log/" + datetime.now().strftime('%Y-%m-%d-%H-%M') + '.log'

    # Register routes
    register_views(app)
    register_api(app)
    register_filters(app)

    return app
@@ -1,7 +1,8 @@
-import configparser
+from configobj import ConfigObj
 import os
 
 def load_config():
-    config = configparser.ConfigParser()
-    config.read(os.path.join(os.path.dirname(__file__), '..', 'config.ini'))
-    return config
+    config_path = os.path.join(os.path.dirname(__file__), '..', 'config.ini')
+
+    # Load config while preserving sections as nested dicts
+    return ConfigObj(config_path)
@@ -4,4 +4,12 @@ from datetime import datetime
 def register_filters(app):
     @app.template_filter('datetimeformat')
     def datetimeformat(value):
-        return datetime.fromtimestamp(value).strftime('%Y-%m-%d %H:%M:%S')
+        """Convert datetime or timestamp to formatted string"""
+        if isinstance(value, datetime):
+            dt = value
+        else:
+            try:
+                dt = datetime.fromtimestamp(float(value))
+            except (ValueError, TypeError):
+                return str(value)
+        return dt.strftime('%Y-%m-%d %H:%M:%S')
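The reworked `datetimeformat` filter accepts either a `datetime` or something convertible to a float timestamp, and falls back to `str(value)` otherwise. The sketch below shows the same function outside Flask so its behaviour can be tried directly. One deliberate difference, flagged as an assumption of this note rather than something in the diff: it also catches `OverflowError` and `OSError`, which `datetime.fromtimestamp` can raise for infinite or out-of-range values.

```python
from datetime import datetime


def datetimeformat(value):
    """Format a datetime or a numeric timestamp; fall back to str(value)."""
    if isinstance(value, datetime):
        dt = value
    else:
        try:
            dt = datetime.fromtimestamp(float(value))
        # OverflowError/OSError are an addition in this sketch, not in the diff.
        except (ValueError, TypeError, OverflowError, OSError):
            return str(value)
    return dt.strftime('%Y-%m-%d %H:%M:%S')


if __name__ == "__main__":
    print(datetimeformat(datetime(2025, 1, 2, 3, 4, 5)))  # formatted datetime
    print(datetimeformat("not a timestamp"))              # returned unchanged
    print(datetimeformat("inf"))                          # returned unchanged
```

Numeric timestamps are rendered in the server's local time zone, since `datetime.fromtimestamp` is called without a `tz` argument.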
@@ -4,23 +4,19 @@ from queue import Queue
|
||||
import os
|
||||
from datetime import datetime
|
||||
|
||||
from app.config import load_config
|
||||
from flask import current_app
|
||||
|
||||
config = load_config()
|
||||
def init_logger(config):
|
||||
LOG_DIR = config.get('LOGGING', {}).get('LOG_DIR', 'log')
|
||||
|
||||
# Define the log directory and ensure it exists
|
||||
LOG_DIR = config['LOGGING']['LOG_DIR']
|
||||
if not os.path.exists(LOG_DIR):
|
||||
os.makedirs(LOG_DIR)
|
||||
|
||||
# Generate the log filename dynamically
|
||||
log_file_name = os.path.join(LOG_DIR, datetime.now().strftime('%Y-%m-%d-%H-%M') + '.log')
|
||||
|
||||
# Initialize the logger
|
||||
logger = logging.getLogger(__name__)
|
||||
logger.setLevel(logging.DEBUG)
|
||||
|
||||
# File handler
|
||||
file_handler = logging.FileHandler(log_file_name, mode='w')
|
||||
file_handler.setLevel(logging.DEBUG)
|
||||
formatter = logging.Formatter('%(asctime)s - %(levelname)s: %(message)s',
|
||||
@@ -28,12 +24,11 @@ formatter = logging.Formatter('%(asctime)s - %(levelname)s: %(message)s',
|
||||
file_handler.setFormatter(formatter)
|
||||
logger.addHandler(file_handler)
|
||||
|
||||
# Queue handler for real-time logging
|
||||
log_queue = Queue()
|
||||
queue_handler = QueueHandler(log_queue)
|
||||
queue_handler.setLevel(logging.DEBUG)
|
||||
logger.addHandler(queue_handler)
|
||||
|
||||
# Function to get logger in other modules
|
||||
def get_logger():
|
||||
logger.debug("Logger initialized")
|
||||
|
||||
return logger
|
||||
130
app/models.py
@@ -5,37 +5,71 @@ import os
|
||||
import time
|
||||
from datetime import datetime, timedelta
|
||||
from requests.exceptions import ConnectionError, Timeout, RequestException
|
||||
import redis
|
||||
import threading
|
||||
|
||||
from app.logging_config import get_logger
|
||||
|
||||
from app.config import load_config
|
||||
|
||||
config = load_config()
|
||||
API_KEY = config['DEFAULT']['API_KEY']
|
||||
|
||||
logger = get_logger()
|
||||
from flask import current_app
|
||||
|
||||
class Scraper:
|
||||
def __init__(self, faction_id, fetch_interval, run_interval, app):
|
||||
_instances = {} # Track all instances by faction_id
|
||||
_lock = threading.Lock()
|
||||
|
||||
def __new__(cls, faction_id, *args, **kwargs):
|
||||
with cls._lock:
|
||||
# Stop any existing instance for this faction
|
||||
if faction_id in cls._instances:
|
||||
old_instance = cls._instances[faction_id]
|
||||
old_instance.stop_scraping()
|
||||
|
||||
instance = super().__new__(cls)
|
||||
cls._instances[faction_id] = instance
|
||||
return instance
|
||||
|
||||
def __init__(self, faction_id, fetch_interval, run_interval, config):
|
||||
# Only initialize if not already initialized
|
||||
if not hasattr(self, 'faction_id'):
|
||||
self.redis_client = redis.StrictRedis(
|
||||
host='localhost', port=6379, db=0, decode_responses=True
|
||||
)
|
||||
self.faction_id = faction_id
|
||||
self.fetch_interval = fetch_interval
|
||||
self.run_interval = run_interval
|
||||
self.end_time = datetime.now() + timedelta(days=run_interval)
|
||||
self.data_file_name = os.path.join(app.config['DATA']['DATA_DIR'], f"{self.faction_id}-{datetime.now().strftime('%Y-%m-%d-%H-%M')}.csv")
|
||||
self.scraping_active = False
|
||||
self.API_KEY = config['DEFAULT']['API_KEY']
|
||||
self.data_file_name = os.path.join(
|
||||
config['DATA']['DATA_DIR'],
|
||||
f"{faction_id}-{datetime.now().strftime('%Y-%m-%d-%H-%M')}.csv"
|
||||
)
|
||||
self.end_time = datetime.now() + timedelta(days=int(run_interval))
|
||||
|
||||
print(self.data_file_name)
|
||||
# Store scraper state in Redis
|
||||
self.redis_client.hmset(f"scraper:{faction_id}", {
|
||||
"faction_id": faction_id,
|
||||
"fetch_interval": fetch_interval,
|
||||
"run_interval": run_interval,
|
||||
"end_time": self.end_time.isoformat(),
|
||||
"data_file_name": self.data_file_name,
|
||||
"scraping_active": "0",
|
||||
"api_key": self.API_KEY
|
||||
})
|
||||
|
||||
@property
|
||||
def scraping_active(self):
|
||||
return bool(int(self.redis_client.hget(f"scraper:{self.faction_id}", "scraping_active")))
|
||||
|
||||
@scraping_active.setter
|
||||
def scraping_active(self, value):
|
||||
self.redis_client.hset(f"scraper:{self.faction_id}", "scraping_active", "1" if value else "0")
|
||||
|
||||
def fetch_faction_data(self):
|
||||
url = f"https://api.torn.com/faction/{self.faction_id}?selections=&key={API_KEY}"
|
||||
url = f"https://api.torn.com/faction/{self.faction_id}?selections=&key={self.API_KEY}"
|
||||
response = requests.get(url)
|
||||
if response.status_code == 200:
|
||||
return response.json()
|
||||
logger.warning(f"Failed to fetch faction data for faction ID {self.faction_id}. Response: {response.text}")
|
||||
current_app.logger.warning(f"Failed to fetch faction data for faction ID {self.faction_id}. Response: {response.text}")
|
||||
return None
|
||||
|
||||
def fetch_user_activity(self, user_id):
|
||||
url = f"https://api.torn.com/user/{user_id}?selections=basic,profile&key={API_KEY}"
|
||||
url = f"https://api.torn.com/user/{user_id}?selections=basic,profile&key={self.API_KEY}"
|
||||
retries = 3
|
||||
for attempt in range(retries):
|
||||
try:
|
||||
@@ -43,46 +77,50 @@ class Scraper:
|
||||
response.raise_for_status()
|
||||
return response.json()
|
||||
except ConnectionError as e:
|
||||
logger.error(f"Connection error while fetching user activity for user ID {user_id}: {e}")
|
||||
current_app.logger.error(f"Connection error while fetching user activity for user ID {user_id}: {e}")
|
||||
except Timeout as e:
|
||||
logger.error(f"Timeout error while fetching user activity for user ID {user_id}: {e}")
|
||||
current_app.logger.error(f"Timeout error while fetching user activity for user ID {user_id}: {e}")
|
||||
except RequestException as e:
|
||||
logger.error(f"Error while fetching user activity for user ID {user_id}: {e}")
|
||||
current_app.logger.error(f"Error while fetching user activity for user ID {user_id}: {e}")
|
||||
if attempt < retries - 1:
|
||||
current_app.logger.debug(f"Retrying {attempt + 1}/{retries} for user {user_id}")
|
||||
time.sleep(2 ** attempt) # Exponential backoff
|
||||
return None
|
||||
|
||||
def start_scraping(self) -> None:
|
||||
"""Starts the scraping process until the end time is reached or stopped manually."""
|
||||
self.scraping_active = True
|
||||
logger.info(f"Starting scraping for faction ID {self.faction_id}")
|
||||
logger.debug(f"Fetch interval: {self.fetch_interval}s, Run interval: {self.run_interval} days, End time: {self.end_time}")
|
||||
|
||||
MAX_FAILURES = 5 # Stop after 5 consecutive failures
|
||||
current_app.logger.info(f"Starting scraping for faction ID {self.faction_id}")
|
||||
current_app.logger.debug(f"Fetch interval: {self.fetch_interval}s, Run interval: {self.run_interval} days, End time: {self.end_time}")
|
||||
|
||||
MAX_FAILURES = 5
|
||||
failure_count = 0
|
||||
|
||||
while datetime.now() < self.end_time and self.scraping_active:
|
||||
logger.info(f"Fetching data at {datetime.now()}")
|
||||
current_app.logger.info(f"Fetching data at {datetime.now()}")
|
||||
faction_data = self.fetch_faction_data()
|
||||
|
||||
if not faction_data or "members" not in faction_data:
|
||||
logger.warning(f"No faction data found for ID {self.faction_id} (Failure {failure_count + 1}/{MAX_FAILURES})")
|
||||
current_app.logger.warning(f"No faction data found for ID {self.faction_id} (Failure {failure_count + 1}/{MAX_FAILURES})")
|
||||
failure_count += 1
|
||||
if failure_count >= MAX_FAILURES:
|
||||
logger.error(f"Max failures reached ({MAX_FAILURES}). Stopping scraping.")
|
||||
current_app.logger.error(f"Max failures reached ({MAX_FAILURES}). Stopping scraping.")
|
||||
break
|
||||
time.sleep(self.fetch_interval)
|
||||
continue
|
||||
|
||||
current_app.logger.info(f"Fetched {len(faction_data['members'])} members for faction {self.faction_id}")
|
||||
failure_count = 0 # Reset failure count on success
|
||||
user_activity_data = self.process_faction_members(faction_data["members"])
|
||||
self.save_data(user_activity_data)
|
||||
|
||||
logger.info(f"Data appended to {self.data_file_name}")
|
||||
current_app.logger.info(f"Data appended to {self.data_file_name}")
|
||||
time.sleep(self.fetch_interval)
|
||||
|
||||
self.handle_scraping_end()
|
||||
|
||||
|
||||
def process_faction_members(self, members: Dict[str, Dict]) -> List[Dict]:
|
||||
"""Processes and retrieves user activity for all faction members."""
|
||||
user_activity_data = []
|
||||
@@ -96,16 +134,16 @@ class Scraper:
|
||||
"status": user_activity.get("status", {}).get("state", ""),
|
||||
"timestamp": datetime.now().timestamp(),
|
||||
})
|
||||
logger.info(f"Fetched data for user {user_id} ({user_activity.get('name', '')})")
|
||||
current_app.logger.info(f"Fetched data for user {user_id} ({user_activity.get('name', '')})")
|
||||
else:
|
||||
logger.warning(f"Failed to fetch data for user {user_id}")
|
||||
current_app.logger.warning(f"Failed to fetch data for user {user_id}")
|
||||
|
||||
return user_activity_data
|
||||
|
||||
def save_data(self, user_activity_data: List[Dict]) -> None:
|
||||
"""Saves user activity data to a CSV file."""
|
||||
if not user_activity_data:
|
||||
logger.warning("No data to save.")
|
||||
current_app.logger.warning("No data to save.")
|
||||
return
|
||||
|
||||
df = pd.DataFrame(user_activity_data)
|
||||
@@ -117,26 +155,40 @@ class Scraper:
|
||||
try:
|
||||
with open(self.data_file_name, "a" if file_exists else "w") as f:
|
||||
df.to_csv(f, mode="a" if file_exists else "w", header=not file_exists, index=False)
|
||||
logger.info(f"Data successfully saved to {self.data_file_name}")
|
||||
current_app.logger.info(f"Data successfully saved to {self.data_file_name}")
|
||||
except Exception as e:
|
||||
logger.error(f"Error saving data to {self.data_file_name}: {e}")
|
||||
current_app.logger.error(f"Error saving data to {self.data_file_name}: {e}")
|
||||
|
||||
def cleanup_redis_state(self):
|
||||
"""Clean up all Redis state for this scraper instance"""
|
||||
if hasattr(self, 'faction_id'):
|
||||
self.redis_client.delete(f"scraper:{self.faction_id}")
|
||||
current_id = self.redis_client.get("current_faction_id")
|
||||
if current_id and current_id == str(self.faction_id):
|
||||
self.redis_client.delete("current_faction_id")
|
||||
# Remove from instances tracking
|
||||
with self._lock:
|
||||
if self.faction_id in self._instances:
|
||||
del self._instances[self.faction_id]
|
||||
|
||||
def handle_scraping_end(self) -> None:
|
||||
"""Handles cleanup and logging when scraping ends."""
|
||||
if not self.scraping_active:
|
||||
logger.warning(f"Scraping stopped manually at {datetime.now()}")
|
||||
current_app.logger.warning(f"Scraping stopped manually at {datetime.now()}")
|
||||
elif datetime.now() >= self.end_time:
|
||||
logger.warning(f"Scraping stopped due to timeout at {datetime.now()} (Run interval: {self.run_interval} days)")
|
||||
current_app.logger.warning(f"Scraping stopped due to timeout at {datetime.now()} (Run interval: {self.run_interval} days)")
|
||||
else:
|
||||
logger.error(f"Unexpected stop at {datetime.now()}")
|
||||
current_app.logger.error(f"Unexpected stop at {datetime.now()}")
|
||||
|
||||
logger.info("Scraping completed.")
|
||||
current_app.logger.info("Scraping completed.")
|
||||
self.scraping_active = False
|
||||
self.cleanup_redis_state()
|
||||
|
||||
def stop_scraping(self):
|
||||
self.scraping_active = False
|
||||
logger.debug("Scraping stopped by user")
|
||||
self.cleanup_redis_state()
|
||||
current_app.logger.debug(f"Scraping stopped for faction {self.faction_id}")
|
||||
|
||||
def generate_statistics(df):
|
||||
df['hour'] = df['timestamp'].dt.hour # No need to convert timestamp again
|
||||
return df.groupby('hour').size() # Activity by hour
|
||||
def __del__(self):
|
||||
"""Ensure Redis cleanup on object destruction"""
|
||||
self.cleanup_redis_state()
|
||||
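`fetch_user_activity` in this file retries a failed request up to three times and sleeps `2 ** attempt` seconds between attempts. A minimal sketch of that retry pattern in isolation is given below. The names `fetch_with_retries`, `fetch`, and `sleep` are hypothetical and exist only to make the loop testable without network access; the diff itself catches the specific `requests` exception types rather than a bare `Exception`.

```python
import time


def fetch_with_retries(fetch, retries=3, sleep=time.sleep):
    """Call `fetch()` up to `retries` times, sleeping 2**attempt seconds between tries.

    `fetch` and `sleep` are hypothetical injection points for illustration.
    Returns the first successful result, or None if every attempt raised.
    """
    for attempt in range(retries):
        try:
            return fetch()
        except Exception as exc:  # the diff catches requests' exception types instead
            print(f"attempt {attempt + 1}/{retries} failed: {exc}")
        if attempt < retries - 1:
            sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    return None


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        state["calls"] += 1
        if state["calls"] < 3:
            raise RuntimeError("temporary failure")
        return {"status": "ok"}

    # Pass a no-op sleep so the demo finishes immediately.
    print(fetch_with_retries(flaky, sleep=lambda seconds: None))
```

Skipping the sleep after the final attempt avoids a pointless delay before returning `None`, which matches the `attempt < retries - 1` guard in the diff.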
@@ -1,2 +0,0 @@
|
||||
data_file_name = None
|
||||
log_file_name = None
|
||||
38
app/static/common.js
Normal file
@@ -0,0 +1,38 @@
import { ScraperUtils } from './scraper_utils.js';

class Common {
    constructor() {
        this.utils = new ScraperUtils();
        this.addEventListeners();
        this.scheduleUpdates();
    }

    scheduleUpdates() {
        // Ensure server time updates every minute but only after initial fetch
        setTimeout(() => {
            setInterval(() => this.utils.updateServerTime(), 60000);
        }, 5000); // Delay first scheduled update to prevent duplicate initial request
    }

    addEventListeners() {
        if (this.utils.stopButton) {
            this.utils.stopButton.addEventListener('click', () => this.utils.checkScrapingStatus());
        }
    }
}

document.addEventListener('DOMContentLoaded', () => {
    new Common();
});

window.checkAllCheckboxes = function(tableId, checkAllId) {
    var table = document.getElementById(tableId);
    var checkAll = document.getElementById(checkAllId);
    var checkboxes = table.querySelectorAll('input[type="checkbox"]');

    checkboxes.forEach(function(checkbox) {
        if (!checkbox.disabled) {
            checkbox.checked = checkAll.checked;
        }
    });
};
@@ -94,11 +94,3 @@ function sortTable(columnIndex, tableId) {
|
||||
// Reinsert sorted rows
|
||||
rows.forEach(row => tbody.appendChild(row));
|
||||
}
|
||||
|
||||
function checkAllCheckboxes(tableId, checkAllCheckboxId) {
|
||||
const table = document.getElementById(tableId);
|
||||
const checkboxes = table.querySelectorAll('input[name="fileCheckbox"]');
|
||||
const checkAllCheckbox = document.getElementById(checkAllCheckboxId);
|
||||
|
||||
checkboxes.forEach(checkbox => checkbox.checked = checkAllCheckbox.checked);
|
||||
}
|
||||
|
||||
@@ -1,91 +1,21 @@
|
||||
class LogScraperApp {
|
||||
import { ScraperUtils } from './scraper_utils.js';
|
||||
|
||||
class ScraperApp {
|
||||
constructor() {
|
||||
this.utils = new ScraperUtils();
|
||||
this.form = document.getElementById('scrapingForm');
|
||||
this.stopButton = document.getElementById('stopButton');
|
||||
this.logsElement = document.getElementById('logs');
|
||||
this.prevPageButton = document.getElementById('prevPage');
|
||||
this.nextPageButton = document.getElementById('nextPage');
|
||||
this.pageInfo = document.getElementById('pageInfo');
|
||||
this.startButton = document.getElementById('startButton');
|
||||
|
||||
this.currentPage = 0;
|
||||
this.linesPerPage = null;
|
||||
this.autoRefreshInterval = null;
|
||||
|
||||
this.init();
|
||||
}
|
||||
|
||||
async init() {
|
||||
await this.fetchConfig();
|
||||
await this.checkScrapingStatus();
|
||||
init() {
|
||||
this.utils.checkScrapingStatus();
|
||||
this.addEventListeners();
|
||||
}
|
||||
|
||||
async fetchConfig() {
|
||||
try {
|
||||
const response = await fetch('/config/lines_per_page');
|
||||
const data = await response.json();
|
||||
this.linesPerPage = data.lines_per_page;
|
||||
this.fetchLogs(this.currentPage);
|
||||
} catch (error) {
|
||||
console.error('Error fetching config:', error);
|
||||
}
|
||||
}
|
||||
|
||||
async fetchLogs(page) {
|
||||
try {
|
||||
const response = await fetch(`/logfile?page=${page}&lines_per_page=${this.linesPerPage}`);
|
||||
const data = await response.json();
|
||||
|
||||
if (data.error) {
|
||||
this.logsElement.textContent = data.error;
|
||||
} else {
|
||||
this.logsElement.innerHTML = data.log.map((line, index) => {
|
||||
const lineNumber = data.start_line - index;
|
||||
return `<span class="line-number">${lineNumber}</span> ${line}`;
|
||||
}).join('');
|
||||
|
||||
this.updatePagination(data.total_lines);
|
||||
}
|
||||
} catch (error) {
|
||||
console.error('Error fetching logs:', error);
|
||||
}
|
||||
}
|
||||
|
||||
updatePagination(totalLines) {
|
||||
this.prevPageButton.disabled = this.currentPage === 0;
|
||||
this.nextPageButton.disabled = (this.currentPage + 1) * this.linesPerPage >= totalLines;
|
||||
this.pageInfo.textContent = `Page ${this.currentPage + 1} of ${Math.ceil(totalLines / this.linesPerPage)}`;
|
||||
}
|
||||
|
||||
startAutoRefresh() {
|
||||
this.autoRefreshInterval = setInterval(() => this.fetchLogs(this.currentPage), 5000);
|
||||
}
|
||||
|
||||
stopAutoRefresh() {
|
||||
clearInterval(this.autoRefreshInterval);
|
||||
}
|
||||
|
||||
async checkScrapingStatus() {
|
||||
try {
|
||||
const response = await fetch('/scraping_status');
|
||||
const data = await response.json();
|
||||
if (data.scraping_active) {
|
||||
this.startButton.disabled = true;
|
||||
this.stopButton.disabled = false;
|
||||
this.startAutoRefresh();
|
||||
} else {
|
||||
this.startButton.disabled = false;
|
||||
this.stopButton.disabled = true;
|
||||
}
|
||||
this.fetchLogs(this.currentPage);
|
||||
} catch (error) {
|
||||
console.error('Error checking scraping status:', error);
|
||||
}
|
||||
}
|
||||
|
||||
async startScraping(event) {
|
||||
event.preventDefault();
|
||||
event.preventDefault(); // Prevent default form submission
|
||||
const formData = new FormData(this.form);
|
||||
try {
|
||||
const response = await fetch('/start_scraping', {
|
||||
@@ -93,12 +23,8 @@ class LogScraperApp {
|
||||
body: formData
|
||||
});
|
||||
const data = await response.json();
|
||||
|
||||
console.log(data);
|
||||
if (data.status === "Scraping started") {
|
||||
this.startButton.disabled = true;
|
||||
this.stopButton.disabled = false;
|
||||
this.startAutoRefresh();
|
||||
this.utils.checkScrapingStatus(); // Update UI
|
||||
}
|
||||
} catch (error) {
|
||||
console.error('Error starting scraping:', error);
|
||||
@@ -107,14 +33,12 @@ class LogScraperApp {
|
||||
|
||||
async stopScraping() {
|
||||
try {
|
||||
const response = await fetch('/stop_scraping', { method: 'POST' });
|
||||
const response = await fetch('/stop_scraping', {
|
||||
method: 'POST'
|
||||
});
|
||||
const data = await response.json();
|
||||
|
||||
console.log(data);
|
||||
if (data.status === "Scraping stopped") {
|
||||
this.startButton.disabled = false;
|
||||
this.stopButton.disabled = true;
|
||||
this.stopAutoRefresh();
|
||||
this.utils.checkScrapingStatus(); // Update UI
|
||||
}
|
||||
} catch (error) {
|
||||
console.error('Error stopping scraping:', error);
|
||||
@@ -122,23 +46,11 @@ class LogScraperApp {
|
||||
}
|
||||
|
||||
addEventListeners() {
|
||||
this.prevPageButton.addEventListener('click', () => {
|
||||
if (this.currentPage > 0) {
|
||||
this.currentPage--;
|
||||
this.fetchLogs(this.currentPage);
|
||||
}
|
||||
});
|
||||
|
||||
this.nextPageButton.addEventListener('click', () => {
|
||||
this.currentPage++;
|
||||
this.fetchLogs(this.currentPage);
|
||||
});
|
||||
|
||||
this.form.addEventListener('submit', (event) => this.startScraping(event));
|
||||
|
||||
this.stopButton.addEventListener('click', () => this.stopScraping());
|
||||
}
|
||||
}
|
||||
|
||||
// Initialize the application when DOM is fully loaded
|
||||
document.addEventListener('DOMContentLoaded', () => new LogScraperApp());
|
||||
document.addEventListener('DOMContentLoaded', () => {
|
||||
new ScraperApp();
|
||||
});
|
||||
|
||||
97
app/static/log_viewer.js
Normal file
@@ -0,0 +1,97 @@
|
||||
class LogViewerApp {
|
||||
constructor() {
|
||||
this.logsElement = document.getElementById('logs');
|
||||
this.prevPageButton = document.getElementById('prevPage');
|
||||
this.nextPageButton = document.getElementById('nextPage');
|
||||
this.pageInfo = document.getElementById('pageInfo');
|
||||
|
||||
this.currentPage = 0;
|
||||
this.linesPerPage = null;
|
||||
this.autoRefreshInterval = null;
|
||||
|
||||
this.init();
|
||||
}
|
||||
|
||||
async init() {
|
||||
await this.fetchConfig();
|
||||
await this.checkScrapingStatus();
|
||||
this.addEventListeners();
|
||||
}
|
||||
|
||||
async fetchConfig() {
|
||||
try {
|
||||
const response = await fetch('/config/lines_per_page');
|
||||
const data = await response.json();
|
||||
this.linesPerPage = data.lines_per_page;
|
||||
this.fetchLogs(this.currentPage);
|
||||
} catch (error) {
|
||||
console.error('Error fetching config:', error);
|
||||
}
|
||||
}
|
||||
|
||||
async fetchLogs(page) {
|
||||
try {
|
||||
const response = await fetch(`/logfile?page=${page}&lines_per_page=${this.linesPerPage}`);
|
||||
const data = await response.json();
|
||||
|
||||
if (data.error) {
|
||||
this.logsElement.textContent = data.error;
|
||||
} else {
|
||||
this.logsElement.innerHTML = data.log.map((line, index) => {
|
||||
const lineNumber = data.start_line - index;
|
||||
return `<span class="line-number">${lineNumber}</span> ${line}`;
|
||||
}).join('');
|
||||
|
||||
this.updatePagination(data.total_lines);
|
||||
}
|
||||
} catch (error) {
|
||||
console.error('Error fetching logs:', error);
|
||||
}
|
||||
}
|
||||
|
||||
updatePagination(totalLines) {
|
||||
this.prevPageButton.disabled = this.currentPage === 0;
|
||||
this.nextPageButton.disabled = (this.currentPage + 1) * this.linesPerPage >= totalLines;
|
||||
this.pageInfo.textContent = `Page ${this.currentPage + 1} of ${Math.ceil(totalLines / this.linesPerPage)}`;
|
||||
}
|
||||
|
||||
startAutoRefresh() {
|
||||
this.autoRefreshInterval = setInterval(() => this.fetchLogs(this.currentPage), 5000);
|
||||
}
|
||||
|
||||
stopAutoRefresh() {
|
||||
clearInterval(this.autoRefreshInterval);
|
||||
}
|
||||
|
||||
async checkScrapingStatus() {
|
||||
try {
|
||||
const response = await fetch('/scraping_status');
|
||||
const data = await response.json();
|
||||
if (data.scraping_active) {
|
||||
this.startAutoRefresh();
|
||||
} else {
|
||||
this.stopAutoRefresh();
|
||||
}
|
||||
this.fetchLogs(this.currentPage);
|
||||
} catch (error) {
|
||||
console.error('Error checking scraping status:', error);
|
||||
}
|
||||
}
|
||||
|
||||
addEventListeners() {
|
||||
this.prevPageButton.addEventListener('click', () => {
|
||||
if (this.currentPage > 0) {
|
||||
this.currentPage--;
|
||||
this.fetchLogs(this.currentPage);
|
||||
}
|
||||
});
|
||||
|
||||
this.nextPageButton.addEventListener('click', () => {
|
||||
this.currentPage++;
|
||||
this.fetchLogs(this.currentPage);
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
// Initialize the application when DOM is fully loaded
|
||||
document.addEventListener('DOMContentLoaded', () => new LogViewerApp());
|
||||
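The pagination checks in `updatePagination()` and the descending line numbers in `fetchLogs()` come down to a little arithmetic. The sketch below restates that arithmetic in Python for illustration only; the function names `pagination_state` and `line_numbers` are hypothetical and do not exist in the repository.

```python
import math


def pagination_state(current_page: int, lines_per_page: int, total_lines: int) -> dict:
    """Mirror the checks in updatePagination() for a zero-based page index."""
    total_pages = math.ceil(total_lines / lines_per_page)
    return {
        # "Previous" is disabled only on the first page
        "prev_disabled": current_page == 0,
        # "Next" is disabled once the current page reaches the last line
        "next_disabled": (current_page + 1) * lines_per_page >= total_lines,
        "label": f"Page {current_page + 1} of {total_pages}",
    }


def line_numbers(start_line: int, count: int) -> list[int]:
    """Line numbers count down from start_line, as in fetchLogs()."""
    return [start_line - index for index in range(count)]
```

With this formula an empty log (`total_lines == 0`) produces the label "Page 1 of 0"; clamping the page count with `max(1, ...)` would be one way to avoid that, if it matters for the UI.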
203
app/static/scraper_utils.js
Normal file
@@ -0,0 +1,203 @@
|
||||
export class ScraperUtils {
|
||||
constructor() {
|
||||
this.activityIndicator = document.getElementById('activity_indicator');
|
||||
this.endTimeElement = document.getElementById('end_time');
|
||||
this.serverTimeElement = document.getElementById('server_time');
|
||||
this.timeLeftElement = document.getElementById('time-left'); // Countdown display ("Time Left")
|
||||
this.stopButton = document.getElementById('stopButton');
|
||||
this.startButton = document.getElementById('startButton');
|
||||
this.statusContainer = document.getElementById('status_container');
|
||||
this.loadingIndicator = document.getElementById('loading_indicator');
|
||||
this.statusContent = document.querySelectorAll('#status_content');
|
||||
|
||||
this.serverTime = null;
|
||||
this.endTime = null;
|
||||
this.pollInterval = null; // Handle returned by setInterval() in startPolling()
|
||||
|
||||
this.init();
|
||||
}
|
||||
|
||||
async init() {
|
||||
this.showLoadingIndicator();
|
||||
|
||||
try {
|
||||
await Promise.all([
|
||||
this.updateServerTime(),
|
||||
this.checkScrapingStatus()
|
||||
]);
|
||||
} catch (error) {
|
||||
console.error("Error during initialization:", error);
|
||||
}
|
||||
|
||||
// Start polling for status updates
|
||||
this.startPolling();
|
||||
|
||||
// Only start the clock and wait for end time if scraping is active
|
||||
if (this.activityIndicator.textContent === 'Active') {
|
||||
if (!this.endTime) {
|
||||
try {
|
||||
await this.fetchEndTime();
|
||||
} catch (error) {
|
||||
console.error("Error fetching end time:", error);
|
||||
}
|
||||
}
|
||||
|
||||
if (this.serverTime && this.endTime) {
|
||||
this.startClock();
|
||||
}
|
||||
}
|
||||
|
||||
// Hide loading indicator regardless of scraping status
|
||||
this.hideLoadingIndicator();
|
||||
}
|
||||
|
||||
startPolling() {
|
||||
// Poll every 2 seconds
|
||||
this.pollInterval = setInterval(async () => {
|
||||
await this.checkScrapingStatus();
|
||||
}, 2000);
|
||||
}
|
||||
|
||||
stopPolling() {
|
||||
if (this.pollInterval) {
|
||||
clearInterval(this.pollInterval);
|
||||
this.pollInterval = null;
|
||||
}
|
||||
}
|
||||
|
||||
showLoadingIndicator() {
|
||||
this.statusContainer.classList.remove('d-none');
|
||||
this.loadingIndicator.classList.remove('d-none');
|
||||
this.statusContent.forEach(element => element.classList.add('d-none'));
|
||||
}
|
||||
|
||||
hideLoadingIndicator() {
|
||||
this.loadingIndicator.classList.add('d-none');
|
||||
this.statusContent.forEach(element => element.classList.remove('d-none'));
|
||||
}
|
||||
|
||||
async checkScrapingStatus() {
|
||||
try {
|
||||
const response = await fetch('/scraping_status');
|
||||
const data = await response.json();
|
||||
|
||||
if (data.scraping_active) {
|
||||
if (this.startButton) this.startButton.disabled = true;
|
||||
if (this.stopButton) this.stopButton.disabled = false;
|
||||
|
||||
this.activityIndicator.classList.remove('text-bg-danger');
|
||||
this.activityIndicator.classList.add('text-bg-success');
|
||||
this.activityIndicator.textContent = 'Active';
|
||||
|
||||
// Fetch end time if we don't have it yet
|
||||
if (!this.endTime) {
|
||||
await this.fetchEndTime();
|
||||
}
|
||||
|
||||
this.endTimeElement.classList.remove('d-none');
|
||||
this.timeLeftElement.classList.remove('d-none');
|
||||
} else {
|
||||
if (this.startButton) this.startButton.disabled = false;
|
||||
if (this.stopButton) this.stopButton.disabled = true;
|
||||
|
||||
this.activityIndicator.classList.remove('text-bg-success');
|
||||
this.activityIndicator.classList.add('text-bg-danger');
|
||||
this.activityIndicator.textContent = 'Inactive';
|
||||
|
||||
this.endTimeElement.classList.add('d-none');
|
||||
this.timeLeftElement.classList.add('d-none');
|
||||
|
||||
// Reset end time when inactive
|
||||
this.endTime = null;
|
||||
}
|
||||
} catch (error) {
|
||||
console.error('Error checking scraping status:', error);
|
||||
}
|
||||
}
|
||||
|
||||
async updateServerTime() {
|
||||
try {
|
||||
const response = await fetch('/server_time');
|
||||
const data = await response.json();
|
||||
this.serverTime = new Date(data.server_time.replace(' ', 'T'));
|
||||
|
||||
this.serverTimeElement.textContent = `Server Time (TCT): ${this.formatDateToHHMMSS(this.serverTime)}`;
|
||||
} catch (error) {
|
||||
console.error('Error fetching server time:', error);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
async fetchEndTime() {
|
||||
if (this.endTime) return;
|
||||
|
||||
try {
|
||||
const response = await fetch('/scraping_get_end_time');
|
||||
const data = await response.json();
|
||||
if (data.end_time) {
|
||||
this.endTime = new Date(data.end_time);
|
||||
this.endTimeElement.textContent = `Running until ${this.formatDateToYYYYMMDDHHMMSS(this.endTime)} TCT`;
|
||||
}
|
||||
} catch (error) {
|
||||
this.endTimeElement.textContent = 'Error fetching end time';
|
||||
console.error('Error fetching end time:', error);
|
||||
}
|
||||
}
|
||||
|
||||
startClock() {
|
||||
const updateClock = () => {
|
||||
if (this.serverTime) {
|
||||
this.serverTime.setSeconds(this.serverTime.getSeconds() + 1);
|
||||
this.serverTimeElement.textContent = `Server Time (TCT): ${this.formatDateToHHMMSS(this.serverTime)}`;
|
||||
}
|
||||
|
||||
if (this.endTime && this.serverTime) {
|
||||
const timeLeft = this.endTime - this.serverTime;
|
||||
this.timeLeftElement.textContent = `Time Left: ${timeLeft > 0 ? this.formatMillisecondsToHHMMSS(timeLeft) : '00:00:00'}`;
|
||||
}
|
||||
};
|
||||
|
||||
// Immediately update the clock
|
||||
updateClock();
|
||||
|
||||
// Continue updating every second
|
||||
setInterval(updateClock, 1000);
|
||||
}
|
||||
|
||||
formatDateToYYYYMMDDHHMMSS(date) {
|
||||
if (!(date instanceof Date) || isNaN(date)) {
|
||||
console.error('Invalid date:', date);
|
||||
return '';
|
||||
}
|
||||
return `${date.getFullYear()}-${String(date.getMonth() + 1).padStart(2, '0')}-${String(date.getDate()).padStart(2, '0')} ` +
|
||||
`${String(date.getHours()).padStart(2, '0')}:${String(date.getMinutes()).padStart(2, '0')}:${String(date.getSeconds()).padStart(2, '0')}`;
|
||||
}
|
||||
|
||||
formatDateToHHMMSS(date) {
|
||||
if (!(date instanceof Date) || isNaN(date)) {
|
||||
console.error('Invalid date:', date);
|
||||
return '';
|
||||
}
|
||||
return `${String(date.getHours()).padStart(2, '0')}:${String(date.getMinutes()).padStart(2, '0')}:${String(date.getSeconds()).padStart(2, '0')}`;
|
||||
}
|
||||
|
||||
formatMillisecondsToHHMMSS(ms) {
|
||||
const totalSeconds = Math.floor(ms / 1000);
|
||||
const hours = Math.floor(totalSeconds / 3600);
|
||||
const minutes = Math.floor((totalSeconds % 3600) / 60);
|
||||
const seconds = totalSeconds % 60;
|
||||
return `${String(hours).padStart(2, '0')}:${String(minutes).padStart(2, '0')}:${String(seconds).padStart(2, '0')}`;
|
||||
}
|
||||
|
||||
// Stop background polling; intended to run when the page unloads
|
||||
cleanup() {
|
||||
this.stopPolling();
|
||||
}
|
||||
}
|
||||
|
||||
// Stop polling on page unload (assumes the instance was exposed as window.scraperUtils)
|
||||
window.addEventListener('unload', () => {
|
||||
if (window.scraperUtils) {
|
||||
window.scraperUtils.cleanup();
|
||||
}
|
||||
});
|
||||
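The countdown in `startClock()` subtracts the server time from the end time and formats the difference as `HH:MM:SS`, clamping at zero once the end time has passed. A minimal Python restatement of that logic, for illustration only (the names `format_ms_to_hhmmss` and `time_left_label` are hypothetical), might look like this:

```python
from datetime import datetime


def format_ms_to_hhmmss(ms: int) -> str:
    """Format a millisecond duration as HH:MM:SS; hours may exceed 24."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def time_left_label(server_time: datetime, end_time: datetime) -> str:
    """Build the 'Time Left' text, clamped to 00:00:00 once the end time has passed."""
    ms_left = int((end_time - server_time).total_seconds() * 1000)
    remaining = format_ms_to_hhmmss(ms_left) if ms_left > 0 else "00:00:00"
    return f"Time Left: {remaining}"
```

As in the JavaScript version, hours are not wrapped into days, so a run of several days shows as a large hour count.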
93
app/tasks.py
Normal file
@@ -0,0 +1,93 @@
from celery import Celery
from app.models import Scraper
import redis
from datetime import timedelta
from flask import current_app

def create_celery():
    celery = Celery('tasks', broker='redis://localhost:6379/0')
    celery.conf.update(
        task_serializer='json',
        accept_content=['json'],
        result_serializer='json',
        timezone='UTC'
    )
    return celery

def init_celery(app):
    """Initialize Celery with Flask app context"""
    celery = create_celery()
    celery.conf.update(app.config)

    class ContextTask(celery.Task):
        def __call__(self, *args, **kwargs):
            with app.app_context():
                return self.run(*args, **kwargs)

    celery.Task = ContextTask
    return celery

celery = create_celery()  # This will be initialized properly in app/__init__.py

def get_redis():
    return redis.StrictRedis(
        host='localhost',
        port=6379,
        db=0,
        decode_responses=True
    )

@celery.task
def start_scraping_task(faction_id, fetch_interval, run_interval, config_dict):
    """
    Start scraping task with serializable parameters
    Args:
        faction_id: ID of the faction to scrape
        fetch_interval: Interval between fetches in seconds
        run_interval: How long to run the scraper in days
        config_dict: Dictionary containing configuration
    """
    try:
        redis_client = get_redis()
        # Set current faction ID at task start
        redis_client.set("current_faction_id", str(faction_id))

        scraper = Scraper(
            faction_id=faction_id,
            fetch_interval=int(fetch_interval),
            run_interval=int(run_interval),
            config=config_dict
        )
        scraper.start_scraping()
        return {"status": "success"}
    except Exception as e:
        # Clean up Redis state on error
        redis_client = get_redis()
        redis_client.delete("current_faction_id")
        return {"status": "error", "message": str(e)}

@celery.task
def stop_scraping_task(faction_id):
    """Stop scraping task and clean up Redis state"""
    try:
        redis_client = get_redis()

        # Clean up Redis state
        redis_client.hset(f"scraper:{faction_id}", "scraping_active", "0")
        redis_client.delete(f"scraper:{faction_id}")

        # Clean up current_faction_id if it matches
        current_id = redis_client.get("current_faction_id")
        if current_id and current_id == str(faction_id):
            redis_client.delete("current_faction_id")

        # Revoke a task; note that current_task here is this stop task, not the scraping task
        celery.control.revoke(
            celery.current_task.request.id,
            terminate=True,
            signal='SIGTERM'
        )

        return {"status": "success", "message": f"Stopped scraping for faction {faction_id}"}
    except Exception as e:
        return {"status": "error", "message": str(e)}
|
||||
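In `stop_scraping_task`, the id passed to `celery.control.revoke` is `celery.current_task.request.id`, which is the id of the stop task itself. One possible alternative, offered here as an assumption rather than the author's design, is to record the scraping task's id when it starts and revoke that id on stop. The sketch below uses a plain dict in place of Redis and a callable in place of `celery.control.revoke` so that it runs without any services; the key name `scraper_task_id:{faction_id}` and both function names are hypothetical.

```python
from typing import Callable, MutableMapping


def remember_task(store: MutableMapping[str, str], faction_id: int, task_id: str) -> None:
    """Record which task is scraping a faction (hypothetical key layout)."""
    store[f"scraper_task_id:{faction_id}"] = task_id
    store["current_faction_id"] = str(faction_id)


def stop_scraping(
    store: MutableMapping[str, str],
    revoke: Callable[[str], None],
    faction_id: int,
) -> dict:
    """Revoke the recorded scraping task and clear its bookkeeping."""
    task_id = store.pop(f"scraper_task_id:{faction_id}", None)
    if task_id is None:
        return {"status": "error", "message": f"No running task recorded for faction {faction_id}"}
    revoke(task_id)  # in the app this would wrap celery.control.revoke
    # Only clear current_faction_id if it still points at this faction
    if store.get("current_faction_id") == str(faction_id):
        del store["current_faction_id"]
    return {"status": "success", "message": f"Stopped scraping for faction {faction_id}"}
```

In the application, `revoke` could be a small wrapper such as `lambda task_id: celery.control.revoke(task_id, terminate=True, signal='SIGTERM')`, reusing the arguments already present in the diff, and the id could be taken from the result object returned when the scraping task is queued.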
@@ -1,16 +1,100 @@
|
||||
{% extends 'base.html' %}
|
||||
|
||||
{% block content %}
|
||||
|
||||
<section class="container-fluid d-flex justify-content-center">
|
||||
<div class="container-md my-5 mx-2 shadow-lg p-4 ">
|
||||
<div class="container-md my-5 mb-3 mx-2 shadow-lg p-4">
|
||||
<div class="container-sm">
|
||||
<div class="row">
|
||||
<div class="col">
|
||||
<h2>Analyze</h2>
|
||||
<h2>User Activity Distribution</h2>
|
||||
</div>
|
||||
</div>
|
||||
<div class="row">
|
||||
<div class="col">
|
||||
<form method="POST" action="{{ url_for('views.analyze') }}">
|
||||
<!-- Dropdown for selecting data file -->
|
||||
<label for="data_file" class="form-label">Choose Data File:</label>
|
||||
<select name="data_file" id="data_file" class="form-select">
|
||||
{% if data_files %}
|
||||
{% for file in data_files %}
|
||||
{{ file }}
|
||||
{{ selected_file }}
|
||||
<option value="{{ file }}" {% if file == selected_file %}selected{% endif %}>{{ file.split('/')[-1] }}</option>
|
||||
{% endfor %}
|
||||
{% else %}
|
||||
<option disabled>No CSV files found</option>
|
||||
{% endif %}
|
||||
</select>
|
||||
|
||||
<!-- Analysis Selection Table -->
|
||||
<label for="analyses" class="form-label">Select Analyses:</label>
|
||||
<table id="analysesTable" class="table table-bordered table-striped">
|
||||
<thead>
|
||||
<tr>
|
||||
<th width="2%"><input type="checkbox" id="checkAllAnalyses" class="form-check-input" onclick="checkAllCheckboxes('analysesTable', 'checkAllAnalyses')"></th>
|
||||
<th>Analysis Name</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
{% if analyses %}
|
||||
{% for analysis in analyses %}
|
||||
<tr>
|
||||
<td>
|
||||
<input class="form-check-input" type="checkbox" name="analyses" value="{{ analysis.name }}"
|
||||
{% if analysis.name in selected_analyses %}checked{% endif %}>
|
||||
</td>
|
||||
<td>{{ analysis.name }}</td>
|
||||
<td>{{ analysis.description }}</td>
|
||||
</tr>
|
||||
{% endfor %}
|
||||
{% else %}
|
||||
<tr>
|
||||
<td colspan="3" class="text-center">No analyses available</td>
|
||||
</tr>
|
||||
{% endif %}
|
||||
</tbody>
|
||||
</table>
|
||||
<button type="submit" class="btn btn-primary mt-3">Run Analyses</button>
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
{% include 'includes/error.html' %}
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
|
||||
{% if plot_url %}
|
||||
<section class="container-fluid d-flex justify-content-center">
|
||||
<div class="container-md my-1 mx-2 shadow-lg p-4">
|
||||
<div class="container-sm">
|
||||
<div class="row mt-4">
|
||||
<div class="col">
|
||||
<h4>Selected File: {{ selected_file.split('/')[-1] }}</h4>
|
||||
<img src="{{ plot_url }}" class="img-fluid rounded shadow" alt="User Activity Distribution">
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
{% endblock content %}
|
||||
{% endif %}
|
||||
|
||||
{% if results %}
|
||||
{% for analysis_name, result in results.items() %}
|
||||
<section class="container-fluid d-flex justify-content-center">
|
||||
<div class="container-md my-2 mx-2 shadow p-4 pt-0">
|
||||
<div class="container-sm">
|
||||
<div class="results mt-4">
|
||||
<h3>{{ analysis_name }}</h3>
|
||||
<div class="analysis-output">
|
||||
{{ result | safe }} <!-- This allows HTML output -->
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
{% endfor %}
|
||||
{% endif %}
|
||||
|
||||
{% endblock %}
|
||||
|
||||
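The results section renders plugin output with `{{ result | safe }}`, which turns off Jinja's autoescaping for that value. If an analysis plugin builds its HTML from values read out of a CSV file, it would then need to escape those values itself. A minimal sketch of such a helper, using only the standard library (the name `rows_to_html_table` is hypothetical and not part of the plugin system shown in the diff):

```python
from html import escape


def rows_to_html_table(headers: list[str], rows: list[list[object]]) -> str:
    """Build an HTML table string, escaping every header and cell value."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return (
        '<table class="table table-bordered">'
        f"<thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"
    )
```

Only the markup the helper writes itself reaches the page unescaped; anything coming from the data is rendered as text.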
@@ -22,6 +22,9 @@
|
||||
{% block content %}
|
||||
{% endblock %}
|
||||
</main>
|
||||
<footer>
|
||||
{% include 'includes/footer.html' %}
|
||||
</footer>
|
||||
{% block scripts %}
|
||||
{% include 'includes/scripts.html' %}
|
||||
{% endblock %}
|
||||
|
||||
@@ -1,68 +0,0 @@
|
||||
{% extends 'base.html' %}
|
||||
|
||||
{% block content %}
|
||||
<section class="container-fluid d-flex justify-content-center">
|
||||
<div class="container-md my-5 mx-2 shadow-lg p-4 ">
|
||||
<div class="container-sm">
|
||||
<div class="row">
|
||||
<div class="col">
|
||||
<h2>User Activity Distribution</h2>
|
||||
</div>
|
||||
<div class="col text-end">
|
||||
<!-- Dropdown for selecting data file -->
|
||||
<form method="POST" action="{{ url_for('views.data_visualization') }}">
|
||||
<label for="data_file" class="form-label">Choose Data File:</label>
|
||||
<select name="data_file" id="data_file" class="form-select" onchange="this.form.submit()">
|
||||
{% for file in data_files %}
|
||||
<option value="{{ file }}" {% if file == selected_file %}selected{% endif %}>
|
||||
{{ file.split('/')[-1] }}
|
||||
</option>
|
||||
{% endfor %}
|
||||
</select>
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{% if error %}
|
||||
<div class="alert alert-danger mt-3" role="alert">
|
||||
{{ error }}
|
||||
</div>
|
||||
{% endif %}
|
||||
|
||||
{% if plot_url %}
|
||||
<div class="row mt-4">
|
||||
<div class="col">
|
||||
<h4>Selected File: {{ selected_file.split('/')[-1] }}</h4>
|
||||
<img src="{{ plot_url }}" class="img-fluid rounded shadow" alt="User Activity Distribution">
|
||||
</div>
|
||||
</div>
|
||||
{% endif %}
|
||||
|
||||
{% if statistics %}
|
||||
<div class="row mt-4">
|
||||
<div class="col">
|
||||
<h2>Activity Statistics</h2>
|
||||
<table class="table table-bordered table-hover">
|
||||
<thead class="table-dark">
|
||||
<tr>
|
||||
<th>Hour</th>
|
||||
<th>Activity Count</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
{% for hour, count in statistics.items() %}
|
||||
<tr>
|
||||
<td>{{ hour }}</td>
|
||||
<td>{{ count }}</td>
|
||||
</tr>
|
||||
{% endfor %}
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
{% endif %}
|
||||
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
{% endblock content %}
|
||||
@@ -18,7 +18,7 @@
|
||||
<table id="dataFilesTable" class="table table-striped table-bordered table-hover">
|
||||
<thead>
|
||||
<tr>
|
||||
<th width="2%"><input type="checkbox" id="checkAllData" onclick="checkAllCheckboxes('dataFilesTable', 'checkAllData')"></th>
|
||||
<th width="2%"><input type="checkbox" class="form-check-input" id="checkAllData" onclick="checkAllCheckboxes('dataFilesTable', 'checkAllData')"></th>
|
||||
<th onclick="sortTable(1, 'dataFilesTable')">File Name</th>
|
||||
<th onclick="sortTable(2, 'dataFilesTable')">Last Modified</th>
|
||||
<th onclick="sortTable(3, 'dataFilesTable')">Created</th>
|
||||
@@ -30,7 +30,7 @@
|
||||
<tbody>
|
||||
{% for file in files.data %}
|
||||
<tr>
|
||||
<td><input type="checkbox" name="fileCheckbox" value="{{ url_for('download_data_file', filename=file.name_display) }}"{{ ' disabled' if file.active }}></td>
|
||||
<td><input type="checkbox" name="fileCheckbox" class="form-check-input" value="{{ url_for('download_data_file', filename=file.name_display) }}"{{ ' disabled' if file.active }}></td>
|
||||
<td><a href="{{ url_for('download_data_file', filename=file.name_display) }}" target="_blank">{{ file.name_display }}</a></td>
|
||||
<td>{{ file.last_modified | datetimeformat }}</td>
|
||||
<td>{{ file.created | datetimeformat }}</td>
|
||||
@@ -67,7 +67,7 @@
|
||||
<table id="logFilesTable" class="table table-striped table-bordered table-hover">
|
||||
<thead>
|
||||
<tr>
|
||||
<th width="2%"><input type="checkbox" id="checkAllLog" onclick="checkAllCheckboxes('logFilesTable', 'checkAllLog')"></th>
|
||||
<th width="2%"><input type="checkbox" id="checkAllLog" class="form-check-input" onclick="checkAllCheckboxes('logFilesTable', 'checkAllLog')"></th>
|
||||
<th onclick="sortTable(1, 'logFilesTable')">File Name</th>
|
||||
<th onclick="sortTable(2, 'logFilesTable')">Last Modified</th>
|
||||
<th onclick="sortTable(3, 'logFilesTable')">Created</th>
|
||||
@@ -79,7 +79,7 @@
|
||||
<tbody>
|
||||
{% for file in files.log %}
|
||||
<tr>
|
||||
<td><input type="checkbox" name="fileCheckbox" value="{{ url_for('download_log_file', filename=file.name_display) }}"{{ ' disabled' if file.active }}></td>
|
||||
<td><input type="checkbox" name="fileCheckbox" class="form-check-input" value="{{ url_for('download_log_file', filename=file.name_display) }}"{{ ' disabled' if file.active }}></td>
|
||||
<td><a href="{{ url_for('download_log_file', filename=file.name_display) }}" target="_blank">{{ file.name_display }}</a></td>
|
||||
<td>{{ file.last_modified | datetimeformat }}</td>
|
||||
<td>{{ file.created | datetimeformat }}</td>
|
||||
@@ -98,8 +98,5 @@
|
||||
</table>
|
||||
</div>
|
||||
</section>
|
||||
{% block scripts %}
|
||||
{{ bootstrap.load_js() }}
|
||||
<script src="{{url_for('.static', filename='download_results.js')}}"></script>
|
||||
{% endblock %}
|
||||
{% endblock content %}
|
||||
6
app/templates/includes/error.html
Normal file
@@ -0,0 +1,6 @@
|
||||
{% if error %}
|
||||
<div class="alert alert-danger alert-dismissible fade show mt-3" role="alert">
|
||||
<strong>Error:</strong> {{ error }}
|
||||
<button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Close"></button>
|
||||
</div>
|
||||
{% endif %}
|
||||
@@ -1,9 +1,8 @@
|
||||
<!-- app/templates/includes/navigation.html -->
|
||||
<nav class="navbar navbar-nav navbar-expand-md bg-primary">
|
||||
<div class="container-fluid">
|
||||
<a class="navbar-brand" href="/">Torn User Activity Scraper</a>
|
||||
<a class="navbar-brand" href="/">{{ main_config.APP_TITLE }}</a>
|
||||
{% from 'bootstrap4/nav.html' import render_nav_item %}
|
||||
{{ render_nav_item('views.data_visualization', 'Data Visualization') }}
|
||||
{{ render_nav_item('views.analyze', 'Data Visualization') }}
|
||||
{{ render_nav_item('download_results', 'Files') }}
|
||||
{{ render_nav_item('log_viewer', 'Logs') }}
|
||||
<div class="d-flex" id="color-mode-toggle">
|
||||
@@ -15,3 +14,26 @@
|
||||
</div>
|
||||
</div>
|
||||
</nav>
|
||||
<div id="status_container" class="container-fluid d-flex justify-content-center">
|
||||
<div class="container-md my-1 shadow p-4 pb-0 m-1 w-50" id="status_badges">
|
||||
<div id="loading_indicator" class="alert alert-info">Loading...</div>
|
||||
<div id="status_content">
|
||||
<div class="row justify-content-center">
|
||||
<div class="col col-6 p-1">
|
||||
<div id="activity_indicator" class="alert alert-danger fw-bolder">Inactive</div>
|
||||
</div>
|
||||
<div class="col col-6 p-1">
|
||||
<div id="server_time" class="alert alert-primary">Server Time (TCT):</div>
|
||||
</div>
|
||||
</div>
|
||||
<div class="row justify-content-center">
|
||||
<div class="col col-6 p-1">
|
||||
<div id="end_time" class="alert alert-info">Running until:</div>
|
||||
</div>
|
||||
<div class="col p-1">
|
||||
<div id="time-left" class="alert alert-info">Time Left:</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
@@ -1,2 +1,3 @@
|
||||
{{ bootstrap.load_js() }}
|
||||
<script src="{{url_for('static', filename='color_mode.js')}}"></script>
|
||||
<script type="module" src="{{ url_for('static', filename='common.js') }}"></script>
|
||||
@@ -2,7 +2,13 @@
|
||||
{% block content %}
|
||||
<section id="scrapingFormContainer" class="container-fluid d-flex justify-content-center">
|
||||
<div class="container-md my-5 mx-2 shadow-lg p-4 ">
|
||||
<h2>Scraper <span id="activity_indicator" class="badge text-bg-danger">Inactive</span></h2>
|
||||
<div class="row">
|
||||
<div class="col">
|
||||
<h2>Scraper</h2>
|
||||
</div>
|
||||
<div class="col text-end">
|
||||
</div>
|
||||
</div>
|
||||
<form id="scrapingForm" method="POST" action="{{ url_for('start_scraping') }}">
|
||||
{{ form.hidden_tag() }}
|
||||
<div class="form-group">
|
||||
@@ -24,23 +30,5 @@
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
<section id="resultsContainer" class="container-fluid d-flex justify-content-center">
|
||||
<div class="container-md my-5 mx-2 shadow-lg p-4" style="height: 500px;">
|
||||
<div class="row">
|
||||
<div class="col-8">
|
||||
<h2>Logs</h2>
|
||||
<pre id="logs" class="pre-scrollable" style="height: 350px; overflow:scroll; "><code></code></pre>
|
||||
<div class="btn-group btn-group-sm">
|
||||
<button class="btn btn-primary" id="prevPage">Previous</button>
|
||||
<button class="btn btn-primary" id="pageInfo" disabled>Page 1 of 1</button>
|
||||
<button class="btn btn-primary" id="nextPage">Next</button>
|
||||
</div>
|
||||
</div>
|
||||
<div class="col">
|
||||
<h2>Stats</h2>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
<script src="{{url_for('static', filename='index.js')}}"></script>
|
||||
<script type="module" src="{{url_for('static', filename='index.js')}}"></script>
|
||||
{% endblock content %}
|
||||
@@ -1,3 +1,22 @@
|
||||
{% extends 'base.html' %}
|
||||
{% block content %}
|
||||
<section id="resultsContainer" class="container-fluid d-flex justify-content-center">
|
||||
<div class="container-md my-5 mx-2 shadow-lg p-4" style="height: 500px;">
|
||||
<div class="row">
|
||||
<div class="col-8">
|
||||
<h2>Logs</h2>
|
||||
<pre id="logs" class="pre-scrollable" style="height: 350px; overflow:scroll;"><code></code></pre>
|
||||
<div class="btn-group btn-group-sm">
|
||||
<button class="btn btn-primary" id="prevPage">Previous</button>
|
||||
<button class="btn btn-primary" id="pageInfo" disabled>Page 1 of 1</button>
|
||||
<button class="btn btn-primary" id="nextPage">Next</button>
|
||||
</div>
|
||||
</div>
|
||||
<div class="col">
|
||||
<h2>Stats</h2>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</section>
|
||||
<script src="{{url_for('static', filename='log_viewer.js')}}"></script>
|
||||
{% endblock content %}
|
||||
@@ -1,13 +1,10 @@
|
||||
import os
|
||||
import zipfile
|
||||
from datetime import datetime, timedelta
|
||||
|
||||
from app.state import data_file_name, log_file_name
|
||||
from flask import current_app
|
||||
|
||||
from app.config import load_config
|
||||
|
||||
config = load_config()
|
||||
|
||||
def create_zip(file_paths, zip_name, app):
|
||||
temp_dir = os.path.abspath(app.config['TEMP']['TEMP_DIR'])
|
||||
zip_path = os.path.join(temp_dir, zip_name)
|
||||
@@ -18,7 +15,7 @@ def create_zip(file_paths, zip_name, app):
|
||||
return zip_path
|
||||
|
||||
def delete_old_zips():
|
||||
temp_dir = os.path.abspath(config['TEMP']['TEMP_DIR'])
|
||||
temp_dir = os.path.abspath(current_app.config['TEMP']['TEMP_DIR'])
|
||||
now = datetime.now()
|
||||
for filename in os.listdir(temp_dir):
|
||||
if filename.endswith('.zip'):
|
||||
@@ -33,7 +30,7 @@ def tail(filename, n):
|
||||
yield ''
|
||||
return
|
||||
|
||||
page_size = int(config['LOGGING']['TAIL_PAGE_SIZE'])
|
||||
page_size = int(current_app.config['LOGGING']['TAIL_PAGE_SIZE'])
|
||||
offsets = []
|
||||
count = _n = n if n >= 0 else -n
|
||||
|
||||
|
||||
154
app/views.py
@@ -2,23 +2,27 @@ import os
|
||||
import glob
|
||||
from flask import render_template, Blueprint, current_app, request
|
||||
|
||||
from app.tasks import get_redis
|
||||
|
||||
from app.forms import ScrapingForm
|
||||
from app.util import get_size
|
||||
from app.config import load_config
|
||||
from app.api import scraper as scraper# Import the scraper instance
|
||||
from app.logging_config import get_logger
|
||||
from app.analysis import load_data, generate_statistics, plot_activity_distribution
|
||||
from app.api import scraper as scraper
|
||||
|
||||
from app.analysis import load_data, load_analysis_modules
|
||||
|
||||
from app.state import log_file_name
|
||||
|
||||
print(f"A imported log_file_name: {log_file_name}")
|
||||
|
||||
config = load_config()
|
||||
logger = get_logger()
|
||||
from datetime import datetime
|
||||
|
||||
views_bp = Blueprint("views", __name__)
|
||||
|
||||
def sizeof_fmt(num, suffix="B"):
|
||||
"""Convert bytes to human readable format"""
|
||||
for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
|
||||
if abs(num) < 1024.0:
|
||||
return f"{num:3.1f} {unit}{suffix}"
|
||||
num /= 1024.0
|
||||
return f"{num:.1f} Yi{suffix}"
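For reference, a quick check of what `sizeof_fmt` returns. The function body is copied from the hunk so the snippet runs on its own; the sample byte counts are arbitrary and only there for illustration.

```python
def sizeof_fmt(num, suffix="B"):
    """Convert a byte count to a human-readable string with binary prefixes."""
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f} {unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f} Yi{suffix}"


# Sample inputs chosen for illustration only.
samples = [0, 1023, 1536, 1048576]
formatted = [sizeof_fmt(n) for n in samples]
print(formatted)  # ['0.0 B', '1023.0 B', '1.5 KiB', '1.0 MiB']
```

Each step divides by 1024, so the units are binary (KiB, MiB), not decimal (kB, MB).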
|
||||
|
||||
def register_views(app):
|
||||
@app.route('/')
|
||||
def index():
|
||||
@@ -29,100 +33,114 @@ def register_views(app):
|
||||
def results():
|
||||
return render_template('results.html')
|
||||
|
||||
@app.route('/analyze')
|
||||
def analyze():
|
||||
return render_template('analyze.html')
|
||||
|
||||
@app.route('/log_viewer')
|
||||
def log_viewer():
|
||||
return render_template('log_viewer.html')
|
||||
|
||||
@app.route('/download_results')
|
||||
def download_results():
|
||||
log_file_name = os.path.abspath(app.config['LOG_FILE_NAME'])
|
||||
scraper = app.config.get('SCRAPER')
|
||||
# Get the current active log file and data file from Redis and app config
|
||||
redis_client = get_redis()
|
||||
current_faction_id = redis_client.get("current_faction_id")
|
||||
|
||||
if scraper:
|
||||
print(scraper.data_file_name)
|
||||
if not scraper:
|
||||
print("Scraper not initialized")
|
||||
active_data_file = None
|
||||
if current_faction_id:
|
||||
active_data_file = redis_client.hget(f"scraper:{current_faction_id}", "data_file_name")
|
||||
|
||||
data_dir = os.path.abspath(config['DATA']['DATA_DIR'])
|
||||
log_dir = os.path.abspath(config['LOGGING']['LOG_DIR'])
|
||||
active_log_file = app.config['LOG_FILE_NAME']
|
||||
|
||||
data_files = glob.glob(os.path.join(data_dir, "*.csv"))
|
||||
log_files = glob.glob(os.path.join(log_dir, "*.log"))
|
||||
def get_file_info(file_path, file_type='data'):
|
||||
stats = os.stat(file_path)
|
||||
name = os.path.basename(file_path)
|
||||
|
||||
# Determine if file is active
|
||||
is_active = False
|
||||
if file_type == 'data' and active_data_file:
|
||||
is_active = os.path.abspath(file_path) == os.path.abspath(active_data_file)
|
||||
elif file_type == 'log' and active_log_file:
|
||||
is_active = os.path.basename(file_path) == os.path.basename(active_log_file)
|
||||
|
||||
def get_file_info(file_path):
|
||||
return {
|
||||
"name": file_path,
|
||||
"name_display": os.path.basename(file_path),
|
||||
"last_modified": os.path.getmtime(file_path),
|
||||
"created": os.path.getctime(file_path),
|
||||
"size": get_size(file_path)
|
||||
'name': file_path, # Full path for internal use
|
||||
'name_display': name, # Just filename for display
|
||||
'last_modified': stats.st_mtime, # Send timestamp instead of datetime
|
||||
'created': stats.st_ctime, # Send timestamp instead of datetime
|
||||
'size': sizeof_fmt(stats.st_size),
|
||||
'active': is_active
|
||||
}
|
||||
|
||||
data_files_info = [get_file_info(file) for file in data_files]
|
||||
log_files_info = [get_file_info(file) for file in log_files]
|
||||
data_files = []
|
||||
log_files = []
|
||||
|
||||
if scraper and scraper.scraping_active:
|
||||
for data_file in data_files_info:
|
||||
if os.path.abspath(scraper.data_file_name) == data_file['name']:
|
||||
data_file['active'] = True
|
||||
else:
|
||||
data_file['active'] = False
|
||||
# Get data files
|
||||
data_dir = os.path.abspath(app.config['DATA']['DATA_DIR'])
|
||||
if os.path.exists(data_dir):
|
||||
for file in glob.glob(os.path.join(data_dir, "*.csv")):
|
||||
data_files.append(get_file_info(file, 'data'))
|
||||
|
||||
for log_file in log_files_info:
|
||||
if log_file_name == os.path.abspath(log_file['name']):
|
||||
log_file['active'] = True
|
||||
else:
|
||||
log_file['active'] = False
|
||||
# Get log files
|
||||
log_dir = os.path.abspath(app.config['LOGGING']['LOG_DIR'])
|
||||
if os.path.exists(log_dir):
|
||||
for file in glob.glob(os.path.join(log_dir, "*.log")):
|
||||
log_files.append(get_file_info(file, 'log'))
|
||||
|
||||
data_files_info.sort(key=lambda x: x['last_modified'], reverse=True)
|
||||
log_files_info.sort(key=lambda x: x['last_modified'], reverse=True)
|
||||
# Sort files by modification time, newest first
|
||||
data_files.sort(key=lambda x: x['last_modified'], reverse=True)
|
||||
log_files.sort(key=lambda x: x['last_modified'], reverse=True)
|
||||
|
||||
files = {"data": data_files_info, "log": log_files_info}
|
||||
files = {
|
||||
'data': data_files,
|
||||
'log': log_files
|
||||
}
|
||||
|
||||
return render_template('download_results.html', files=files)
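The file listing built in `download_results` can be tried outside Flask. The following is a minimal sketch, not the view itself: `describe_file` is a hypothetical helper name, the active file is passed in directly instead of being read from Redis, and the two CSV files are created in a temporary directory purely for the demonstration.

```python
import os
import tempfile


def describe_file(path, active_path=None):
    """Collect display metadata for one file; `active_path` marks the file in use."""
    stats = os.stat(path)
    is_active = (
        active_path is not None
        and os.path.abspath(path) == os.path.abspath(active_path)
    )
    return {
        "name": path,                            # full path for internal use
        "name_display": os.path.basename(path),  # file name only, for display
        "last_modified": stats.st_mtime,         # raw timestamp, as in the view
        "size_bytes": stats.st_size,
        "active": is_active,
    }


with tempfile.TemporaryDirectory() as data_dir:
    old_path = os.path.join(data_dir, "old.csv")
    new_path = os.path.join(data_dir, "new.csv")
    for path, mtime in ((old_path, 1_000), (new_path, 2_000)):
        with open(path, "w") as handle:
            handle.write("user,status\n")
        os.utime(path, (mtime, mtime))  # fix the timestamps so the order is deterministic

    listing = [describe_file(p, active_path=new_path) for p in (old_path, new_path)]
    # Newest first, mirroring the sort in the view.
    listing.sort(key=lambda info: info["last_modified"], reverse=True)

print([(info["name_display"], info["active"]) for info in listing])
```

Comparing absolute paths avoids false negatives when one side is stored as a relative path.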
|
||||
|
||||
views_bp = Blueprint("views", __name__)
|
||||
|
||||
@views_bp.route("/data-visualization", methods=["GET", "POST"])
|
||||
def data_visualization():
|
||||
"""Route to display activity statistics with a visualization."""
|
||||
data_dir = current_app.config["DATA"]["DATA_DIR"]
|
||||
@views_bp.route("/analyze", methods=["GET", "POST"])
|
||||
def analyze():
|
||||
analysis_modules = load_analysis_modules() # Load available analyses
|
||||
data_dir = current_app.config.get("DATA", {}).get("DATA_DIR")
|
||||
|
||||
selected_file = None
|
||||
selected_analyses = []
|
||||
|
||||
# Find all available CSV files
|
||||
data_files = sorted(
|
||||
glob.glob(os.path.join(data_dir, "*.csv")),
|
||||
key=os.path.getmtime,
|
||||
reverse=True
|
||||
)
|
||||
) if data_dir else []
|
||||
|
||||
if not data_files:
|
||||
return render_template("data_visualization.html", error="No data files found.", data_files=[])
|
||||
context = {
|
||||
"data_files": data_files,
|
||||
"analyses": analysis_modules,
|
||||
"selected_file": selected_file,
|
||||
"selected_analyses": selected_analyses
|
||||
}
|
||||
|
||||
# Get the selected file from the dropdown (default to the latest file)
|
||||
selected_file = request.form.get("data_file", data_files[0] if data_files else None)
|
||||
if request.method == "POST":
|
||||
selected_analyses = request.form.getlist("analyses")
|
||||
selected_file = request.form.get("data_file")
|
||||
|
||||
if not selected_file:
|
||||
context["error"] = "No file selected."
|
||||
return render_template("analyze.html", **context)
|
||||
|
||||
if selected_file and os.path.exists(selected_file):
|
||||
df = load_data(selected_file)
|
||||
statistics = generate_statistics(df)
|
||||
results = {}
|
||||
|
||||
# ✅ Generate the plot and get the correct URL path
|
||||
# remove app/ from the base URL
|
||||
plot_url = plot_activity_distribution(df).replace("app/", "")
|
||||
for analysis in analysis_modules:
|
||||
if analysis.name in selected_analyses:
|
||||
results[analysis.name] = analysis.execute(df) # Some may return HTML
|
||||
|
||||
else:
|
||||
return render_template("data_visualization.html", error="Invalid file selection.", data_files=data_files)
|
||||
context["results"] = results
|
||||
|
||||
return render_template(
|
||||
"data_visualization.html",
|
||||
plot_url=plot_url,
|
||||
statistics=statistics.to_dict(),
|
||||
data_files=data_files,
|
||||
selected_file=selected_file
|
||||
)
|
||||
return render_template("analyze.html", **context)
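The view only relies on each analysis object exposing `name` and `execute(df)`. A minimal sketch of what such a plugin could look like, under stated assumptions: `BaseAnalysis`, `RowCount`, `ActiveShare` and `run_selected` are hypothetical names, and plain lists of dicts stand in for the DataFrame so the snippet has no third-party dependencies. The real `load_analysis_modules()` may discover and structure plugins differently.

```python
from abc import ABC, abstractmethod


class BaseAnalysis(ABC):
    """Hypothetical plugin interface: a display name plus an execute() hook."""

    name = "unnamed"

    @abstractmethod
    def execute(self, rows):
        """Return a result (number, text, HTML) for the given rows."""


class RowCount(BaseAnalysis):
    name = "Row count"

    def execute(self, rows):
        return len(rows)


class ActiveShare(BaseAnalysis):
    name = "Active share"

    def execute(self, rows):
        if not rows:
            return 0.0
        active = sum(1 for row in rows if row["active"])
        return active / len(rows)


def run_selected(analyses, selected_names, rows):
    """Run only the analyses whose names were selected, keyed by name."""
    return {a.name: a.execute(rows) for a in analyses if a.name in selected_names}


# Illustrative data; the real view passes the loaded CSV instead.
rows = [{"user": "a", "active": True}, {"user": "b", "active": False}]
plugins = [RowCount(), ActiveShare()]
results = run_selected(plugins, ["Row count"], rows)
print(results)  # {'Row count': 2}
```

Keying results by `name` means two plugins with the same name would overwrite each other, so names need to be unique.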
|
||||
|
||||
@views_bp.route('/server_time')
|
||||
def server_time():
|
||||
from datetime import timezone  # `datetime` here is the class, which has no `timezone` attribute
current_time = datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M:%S')
|
||||
return {'server_time': current_time}
|
||||
|
||||
app.register_blueprint(views_bp)
|
||||
|
||||
@@ -1,3 +1,7 @@
|
||||
# All main config options will be passed to template engine
|
||||
[MAIN]
|
||||
APP_TITLE = 'Torn User Activity Grabber'
|
||||
|
||||
[DEFAULT]
|
||||
SECRET_KEY = your_secret_key
|
||||
API_KEY = your_api_key
|
||||
|
||||
20
fly.toml
Normal file
@@ -0,0 +1,20 @@
# fly.toml app configuration file generated for tornactivitytracker on 2025-02-11T02:59:23+01:00
#
# See https://fly.io/docs/reference/configuration/ for information about how to use this file.
#

app = 'tornactivitytracker'
primary_region = 'fra'

[build]

[http_service]
  internal_port = 8080
  force_https = true
  auto_stop_machines = 'stop'
  auto_start_machines = true
  min_machines_running = 0
  processes = ['app']

[[vm]]
  size = 'shared-cpu-2x'
@@ -7,3 +7,7 @@ requests
|
||||
matplotlib
|
||||
seaborn
|
||||
configparser
|
||||
plotly
|
||||
configobj
|
||||
redis
|
||||
celery
|
||||
@@ -4,16 +4,35 @@
|
||||
#
|
||||
# pip-compile requirements.in
|
||||
#
|
||||
amqp==5.3.1
|
||||
# via kombu
|
||||
billiard==4.2.1
|
||||
# via celery
|
||||
blinker==1.9.0
|
||||
# via flask
|
||||
bootstrap-flask==2.4.1
|
||||
# via -r requirements.in
|
||||
celery==5.4.0
|
||||
# via -r requirements.in
|
||||
certifi==2025.1.31
|
||||
# via requests
|
||||
charset-normalizer==3.4.1
|
||||
# via requests
|
||||
click==8.1.8
|
||||
# via flask
|
||||
# via
|
||||
# celery
|
||||
# click-didyoumean
|
||||
# click-plugins
|
||||
# click-repl
|
||||
# flask
|
||||
click-didyoumean==0.3.1
|
||||
# via celery
|
||||
click-plugins==1.1.1
|
||||
# via celery
|
||||
click-repl==0.3.0
|
||||
# via celery
|
||||
configobj==5.0.9
|
||||
# via -r requirements.in
|
||||
configparser==7.1.0
|
||||
# via -r requirements.in
|
||||
contourpy==1.3.1
|
||||
@@ -39,6 +58,8 @@ jinja2==3.1.5
|
||||
# via flask
|
||||
kiwisolver==1.4.8
|
||||
# via matplotlib
|
||||
kombu==5.4.2
|
||||
# via celery
|
||||
markupsafe==3.0.2
|
||||
# via
|
||||
# jinja2
|
||||
@@ -48,28 +69,39 @@ matplotlib==3.10.0
|
||||
# via
|
||||
# -r requirements.in
|
||||
# seaborn
|
||||
numpy==2.2.2
|
||||
narwhals==1.27.1
|
||||
# via plotly
|
||||
numpy==2.2.3
|
||||
# via
|
||||
# contourpy
|
||||
# matplotlib
|
||||
# pandas
|
||||
# seaborn
|
||||
packaging==24.2
|
||||
# via matplotlib
|
||||
# via
|
||||
# matplotlib
|
||||
# plotly
|
||||
pandas==2.2.3
|
||||
# via
|
||||
# -r requirements.in
|
||||
# seaborn
|
||||
pillow==11.1.0
|
||||
# via matplotlib
|
||||
plotly==6.0.0
|
||||
# via -r requirements.in
|
||||
prompt-toolkit==3.0.50
|
||||
# via click-repl
|
||||
pyparsing==3.2.1
|
||||
# via matplotlib
|
||||
python-dateutil==2.9.0.post0
|
||||
# via
|
||||
# celery
|
||||
# matplotlib
|
||||
# pandas
|
||||
pytz==2025.1
|
||||
# via pandas
|
||||
redis==5.2.1
|
||||
# via -r requirements.in
|
||||
requests==2.32.3
|
||||
# via -r requirements.in
|
||||
seaborn==0.13.2
|
||||
@@ -77,9 +109,19 @@ seaborn==0.13.2
|
||||
six==1.17.0
|
||||
# via python-dateutil
|
||||
tzdata==2025.1
|
||||
# via pandas
|
||||
# via
|
||||
# celery
|
||||
# kombu
|
||||
# pandas
|
||||
urllib3==2.3.0
|
||||
# via requests
|
||||
vine==5.1.0
|
||||
# via
|
||||
# amqp
|
||||
# celery
|
||||
# kombu
|
||||
wcwidth==0.2.13
|
||||
# via prompt-toolkit
|
||||
werkzeug==3.1.3
|
||||
# via flask
|
||||
wtforms==3.2.1
|
||||
|
||||
7
run.py
@@ -1,5 +1,6 @@
|
||||
from app.app import init_app
|
||||
from app import create_app
|
||||
|
||||
app = create_app()
|
||||
|
||||
if __name__ == '__main__':
|
||||
app = init_app()
|
||||
app.run(debug=True, threaded=True)
|
||||
app.run(debug=True)
|
||||
50
stop_scraping.py
Normal file
@@ -0,0 +1,50 @@
import redis
import argparse

def get_redis():
    return redis.StrictRedis(
        host='localhost',
        port=6379,
        db=0,
        decode_responses=True
    )

def stop_scraping(flush=False, force=False):
    redis_client = get_redis()

    if flush:
        redis_client.flushall()
        print("Flushed all Redis data")
        return True

    current_faction_id = redis_client.get("current_faction_id")

    if not current_faction_id:
        print("No active scraping session found.")
        # Report failure even with --force, so the caller's fallback below can run.
        return False

    redis_client.hset(f"scraper:{current_faction_id}", "scraping_active", "0")
    print(f"Sent stop signal to scraping process for faction {current_faction_id}")
    return True

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Stop the Torn Activity Tracker scraping process.')
    parser.add_argument('--force', action='store_true', help='Force stop even if no active session is found')
    parser.add_argument('--flush', action='store_true', help='Flush all Redis data (WARNING: This will clear ALL Redis data)')

    args = parser.parse_args()

    if args.flush:
        if input("WARNING: This will delete ALL Redis data. Are you sure? (y/N) ").lower() != 'y':
            print("Operation cancelled.")
            raise SystemExit(0)

    success = stop_scraping(flush=args.flush, force=args.force)

    if not success and args.force:
        print("Forcing stop for all potential scraping processes...")
        redis_client = get_redis()
        # Get all scraper keys
        for key in redis_client.keys("scraper:*"):
            redis_client.hset(key, "scraping_active", "0")
        print("Sent stop signal to all potential scraping processes.")
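The script only writes `scraping_active = "0"` into the `scraper:<faction_id>` hash; the scraping process has to notice the change itself. A minimal sketch of the consuming side, under stated assumptions: `InMemoryHashStore` is a hypothetical stand-in that imitates just the `hset`/`hget` calls so the example runs without a Redis server, and `scrape_until_stopped`, `fake_fetch` and the faction id `42` are illustrative names and values, not taken from the project.

```python
class InMemoryHashStore:
    """Hypothetical stand-in for a Redis client; only hset/hget are implemented."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, key, value):
        self._hashes.setdefault(name, {})[key] = str(value)

    def hget(self, name, key):
        return self._hashes.get(name, {}).get(key)


def scrape_until_stopped(client, faction_id, fetch_once, max_rounds=100):
    """Call fetch_once() repeatedly while the scraping_active flag is "1"."""
    key = f"scraper:{faction_id}"
    rounds = 0
    while rounds < max_rounds and client.hget(key, "scraping_active") == "1":
        fetch_once()
        rounds += 1
    return rounds


client = InMemoryHashStore()
client.hset("scraper:42", "scraping_active", "1")
calls = []


def fake_fetch():
    """Pretend to fetch data; clear the flag after the third call, as the stop script would."""
    calls.append(len(calls) + 1)
    if len(calls) == 3:
        client.hset("scraper:42", "scraping_active", "0")


rounds = scrape_until_stopped(client, 42, fake_fetch)
print(rounds)  # 3
```

Because the flag is checked once per round, a stop request takes effect only after the current fetch finishes.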