19 Commits

Author SHA1 Message Date
2217bd5855 README.md updated 2025-02-12 13:25:17 +01:00
fa5d59b069 tests/analyses.ipynb deleted 2025-02-12 13:15:32 +01:00
11e6348a8e updates 2025-02-11 02:49:55 +01:00
c078017b5f fucked up the config file 2025-02-11 02:18:49 +01:00
f3da58e202 adds template config variables (like app title) 2025-02-11 02:16:02 +01:00
57e969a647 adds activity indicators in header (ugly af) 2025-02-11 01:58:06 +01:00
0340dea4f8 adds status and server time indicators. fixes checkboxes 2025-02-10 18:22:06 +01:00
2b6aebdab4 moves app initialization to correct file 2025-02-10 17:45:24 +01:00
a6292d2d0f adds ending time to activity indicator 2025-02-10 17:33:37 +01:00
a44c2bfc04 separates index and log viewer 2025-02-10 16:52:29 +01:00
33621bdec4 refactors logging and config 2025-02-10 16:34:11 +01:00
d1f562ce94 removed state file as current state will be stored in scraper class 2025-02-10 14:12:23 +01:00
5e00df4e13 Merge pull request 'feature/analysis-form' (#10) from feature/analysis-form into master
Reviewed-on: #10
2025-02-10 03:11:57 +01:00
293d3e26a6 Merge pull request 'corrects button display in download_results' (#9) from develop into master
Reviewed-on: #9
2025-02-10 03:11:34 +01:00
ea55c7ad6d adds analysis plugin guide in readme 2025-02-10 03:05:30 +01:00
12e7cffca1 adds check all checkbox 2025-02-10 02:42:27 +01:00
595237c172 adds docstrings 2025-02-10 02:28:50 +01:00
e57869374b fixes #4 - adds modular analyses system using plugins 2025-02-10 02:12:12 +01:00
487d59512a Merge pull request 'adds correct license' (#7) from develop into master
Reviewed-on: #7
2025-02-09 16:07:20 +01:00
42 changed files with 1483 additions and 488 deletions

133
README.md

@@ -1,18 +1,22 @@
# Torn User Activity Scraper # Torn User Activity Tracker
> [!WARNING]
> **Development is still in its early stages; do not put it to productive use!**
## Features ## Features
- Start and stop scraping user activity data - Start and stop scraping user activity data
- View real-time logs - View real-time logs
- Download data and log files - Download data and log files
- View scraping results and statistics - View scraping results
- Plugin based analysis system
- Toggle between light and dark mode - Toggle between light and dark mode
**Note:** Many features are not fully implemented yet, but the activity tracker/grabber works as intended. **Note:** Many features are not fully implemented yet, but the activity tracker/grabber works as intended.
## Planned Features ## Planned Features
- Additional analyses - Additional analyses plugins
- Selector for Torn API data to choose which data shall be tracked - Selector for Torn API data to choose which data shall be tracked
- Improved / fixed log viewer - Improved / fixed log viewer
@@ -93,6 +97,129 @@ flask run
2. Open your web browser and navigate to `http://127.0.0.1:5000/`. 2. Open your web browser and navigate to `http://127.0.0.1:5000/`.
## Adding an Analysis Module
This guide explains how to add a new analysis module using the provided base classes: `BaseAnalysis`, `BasePlotlyAnalysis`, and `BasePlotAnalysis`. The two plot base classes enforce a structured workflow for data preparation, transformation, and visualization.
### 1. Choosing the Right Base Class
Before implementing an analysis module, decide on the appropriate base class:
- **`BasePlotlyAnalysis`**: Use this for interactive plots with **Plotly** that generate **HTML** outputs.
- **`BasePlotAnalysis`**: Use this for static plots with **Matplotlib/Seaborn** that generate **PNG** image files.
- **`BaseAnalysis`**: Use this for any other type of analysis that returns **text** or **HTML**. It offers maximum flexibility because you implement `execute()` yourself.
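As a rough illustration of the third option, the sketch below subclasses a stand-in `BaseAnalysis` and returns a small HTML snippet from `execute()`. The stand-in base class is assumed to mirror `app/analysis/base.py` so that the example runs on its own; `RowCountSummary` and its output format are hypothetical and not part of the project.

```python
from abc import ABC, abstractmethod

import pandas as pd


class BaseAnalysis(ABC):
    """Stand-in for app.analysis.base.BaseAnalysis (assumed to match base.py)."""
    name = "Base Analysis"
    description = "This is a base analysis module."

    @abstractmethod
    def execute(self, df: pd.DataFrame):
        """Run analysis on the given DataFrame."""


class RowCountSummary(BaseAnalysis):
    """Hypothetical text-only analysis: reports how many samples were recorded."""
    name = "Row Count Summary"
    description = "Counts recorded samples and distinct users."

    def execute(self, df: pd.DataFrame) -> str:
        users = df["name"].nunique()
        return f"<p>{len(df)} samples recorded for {users} users.</p>"


# Tiny made-up input, only the column used by the analysis
sample = pd.DataFrame({"name": ["UserA", "UserB", "UserA"]})
summary_html = RowCountSummary().execute(sample)
```

Because `execute()` is the only abstract method on `BaseAnalysis`, a subclass like this is free to return any string the result page can embed.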
### 2. Naming Convention
Follow a structured naming convention for consistency:
- **File name:** `plotly_<analysis_name>.py` for Plotly analyses, `plot_<analysis_name>.py` for Matplotlib-based analyses.
- **Class name:** Use PascalCase and a descriptive suffix:
- Example for Plotly: `PlotlyActivityHeatmap`
- Example for Matplotlib: `PlotUserSessionDuration`
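The relationship between class names and file names can be sketched with a small helper. `module_filename` is a hypothetical function written for this illustration; the project does not ship it, and the convention itself is not enforced by code.

```python
import re


def module_filename(class_name: str) -> str:
    """Convert a PascalCase class name to a snake_case module file name."""
    # Insert an underscore before every capital letter except the first one.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()
    return f"{snake}.py"


plotly_file = module_filename("PlotlyActivityHeatmap")
plot_file = module_filename("PlotUserSessionDuration")
```

With this mapping the `Plotly`/`Plot` class prefix automatically becomes the `plotly_`/`plot_` file prefix.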
### 3. Data Structure
The following DataFrame structure is passed to analysis classes:
| user_id | name | last_action | status | timestamp | prev_timestamp | was_active | hour |
|----------|-----------|----------------------|--------|-----------------------------|----------------|------------|------|
| XXXXXXX | UserA | 2025-02-08 17:58:11 | Okay | 2025-02-08 18:09:41.867984056 | NaT | False | 18 |
| XXXXXXX | UserB | 2025-02-08 17:00:10 | Okay | 2025-02-08 18:09:42.427846909 | NaT | False | 18 |
| XXXXXXX | UserC | 2025-02-08 16:31:52 | Okay | 2025-02-08 18:09:42.823201895 | NaT | False | 18 |
| XXXXXXX | UserD | 2025-02-06 23:57:24 | Okay | 2025-02-08 18:09:43.179914951 | NaT | False | 18 |
| XXXXXXX | UserE | 2025-02-06 06:33:40 | Okay | 2025-02-08 18:09:43.434650898 | NaT | False | 18 |
Note that the first row recorded for each user (that is, the first X rows, where X is the number of tracked members) always has an empty `prev_timestamp` (`NaT`), because no earlier sample exists for that user yet.
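To make the derived columns concrete, here is a minimal sketch that applies the same steps as `prepare_data` in `data_utils.py` to four made-up rows. The user IDs, names, and timestamps are invented for illustration.

```python
import pandas as pd

raw = pd.DataFrame({
    "user_id": [1, 2, 1, 2],
    "name": ["UserA", "UserB", "UserA", "UserB"],
    "last_action": ["2025-02-08 17:58:11", "2025-02-08 17:00:10",
                    "2025-02-08 18:10:30", "2025-02-08 17:00:10"],
    "timestamp": ["2025-02-08 18:09:41", "2025-02-08 18:09:42",
                  "2025-02-08 18:10:41", "2025-02-08 18:10:42"],
})

raw["timestamp"] = pd.to_datetime(raw["timestamp"])
raw["last_action"] = pd.to_datetime(raw["last_action"])

# Previous sample time per user; the first sample of each user has none (NaT)
raw["prev_timestamp"] = raw.groupby("user_id")["timestamp"].shift(1)

# A user counts as active if the last action is at most 60 seconds old
raw["was_active"] = (raw["timestamp"] - raw["last_action"]) <= pd.Timedelta(seconds=60)
raw["hour"] = raw["timestamp"].dt.hour
```

Only the third row counts as active here: its `last_action` is 11 seconds older than the sample time, while the others are minutes or hours old.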
### 4. Implementing an Analysis Module
Each analysis module should define two key methods:
- `transform_data(self, df: pd.DataFrame) -> pd.DataFrame`: Processes the input data for plotting.
- `plot_data(self, df: pd.DataFrame)`: Generates and saves the plot.
#### Example: Adding a Plotly Heatmap
Below is an example of how to create a new analysis module using `BasePlotlyAnalysis`.
```python
import pandas as pd
import plotly.graph_objects as go
from .basePlotlyAnalysis import BasePlotlyAnalysis


class PlotlyActivityHeatmap(BasePlotlyAnalysis):
    """
    Displays user activity trends over multiple days using an interactive heatmap.
    """
    name = "Activity Heatmap (Interactive)"
    description = "Displays user activity trends over multiple days."
    plot_filename = "activity_heatmap.html"

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        df['hour'] = df['timestamp'].dt.hour
        active_counts = df[df['was_active']].pivot_table(
            index='name',
            columns='hour',
            values='was_active',
            aggfunc='sum',
            fill_value=0
        ).reset_index()
        return active_counts.melt(id_vars='name', var_name='hour', value_name='activity_count')

    def plot_data(self, df: pd.DataFrame):
        df = df.pivot(index='name', columns='hour', values='activity_count').fillna(0)
        self.fig = go.Figure(data=go.Heatmap(
            z=df.values, x=df.columns, y=df.index, colorscale='Viridis',
            colorbar=dict(title='Activity Count')
        ))
        self.fig.update_layout(title='User Activity Heatmap', xaxis_title='Hour', yaxis_title='User')
```
#### Example: Adding a Static Matplotlib Plot
Below is an example of a Matplotlib-based analysis module using `BasePlotAnalysis`.
```python
import pandas as pd
import matplotlib.pyplot as plt
from .basePlotAnalysis import BasePlotAnalysis


class PlotUserSessionDuration(BasePlotAnalysis):
    """
    Displays a histogram of user session durations.
    """
    name = "User Session Duration Histogram"
    description = "Histogram of session durations."
    plot_filename = "session_duration.png"

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        # last_action precedes the sample timestamp, so subtract in this
        # order to get a non-negative duration in seconds
        df['session_duration'] = (df['timestamp'] - df['last_action']).dt.total_seconds()
        return df

    def plot_data(self, df: pd.DataFrame):
        plt.figure(figsize=(10, 6))
        plt.hist(df['session_duration'].dropna(), bins=30, edgecolor='black')
        plt.xlabel('Session Duration (seconds)')
        plt.ylabel('Frequency')
        plt.title('User Session Duration Histogram')
```
### 5. Registering the Module
Once you have created your analysis module, it will be automatically discovered by `load_analysis_modules()`, provided it is placed in the correct directory.
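The discovery rule can be sketched without the package layout: only concrete subclasses of `BaseAnalysis` are instantiated, and abstract intermediates are skipped via `inspect.isabstract`. The classes below are simplified stand-ins, and `discover` is a hypothetical helper that takes a list of candidates instead of scanning modules as `load_analysis_modules()` does.

```python
import inspect
from abc import ABC, abstractmethod


class BaseAnalysis(ABC):
    @abstractmethod
    def execute(self, df): ...


class BasePlotAnalysis(BaseAnalysis, ABC):
    """Still abstract: it declares a new abstract method, so it is skipped."""
    def execute(self, df):
        return self.plot_data(df)

    @abstractmethod
    def plot_data(self, df): ...


class PlotExample(BasePlotAnalysis):
    """Hypothetical concrete plugin."""
    def plot_data(self, df):
        return "plotted"


def discover(candidates):
    """Instantiate every concrete BaseAnalysis subclass among the candidates."""
    found = []
    for obj in candidates:
        if (inspect.isclass(obj) and issubclass(obj, BaseAnalysis)
                and obj is not BaseAnalysis and not inspect.isabstract(obj)):
            found.append(obj())
    return found


plugins = discover([BaseAnalysis, BasePlotAnalysis, PlotExample, dict])
```

A practical consequence is that a plugin which forgets to implement `transform_data()` or `plot_data()` stays abstract and is silently left out rather than raising an error.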
### 6. Running the Analysis
To execute the analysis, pass a Pandas DataFrame to its `execute` method:
```python
from app.analysis.plotly_activity_heatmap import PlotlyActivityHeatmap
analysis = PlotlyActivityHeatmap()
result_html = analysis.execute(df)
print(result_html) # Returns the HTML for embedding the plot
```
### Summary
- Choose the appropriate base class (`BasePlotlyAnalysis`, `BasePlotAnalysis`, or `BaseAnalysis`).
- Follow the naming convention (`plotly_<name>.py` for Plotly, `plot_<name>.py` for Matplotlib).
- Implement `transform_data()` and `plot_data()` methods.
- The module will be auto-registered if placed in the correct directory.
- Execute the analysis by calling `.execute(df)`.
This structure ensures that new analyses can be easily integrated and maintained.
## License ## License
All assets and code are under the [CC BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/) LICENSE and in the public domain unless specified otherwise. All assets and code are under the [CC BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/) LICENSE and in the public domain unless specified otherwise.


@@ -0,0 +1,50 @@
import os
from flask import Flask
from flask_bootstrap import Bootstrap5
from datetime import datetime
from app.views import register_views
from app.api import register_api
from app.config import load_config
from app.filters import register_filters
from app.logging_config import init_logger
def create_app():
    app = Flask(__name__)

    os.environ['TZ'] = 'UTC'

    config = load_config()
    app.config['SECRET_KEY'] = config['DEFAULT']['SECRET_KEY']

    # Move bootstrap settings to root level
    for key, value in config.get('BOOTSTRAP', {}).items():
        app.config[key.upper()] = value

    bootstrap = Bootstrap5(app)

    # Store the entire config in Flask app
    app.config.update(config)

    # Initialize other settings
    app.config['SCRAPING_ACTIVE'] = False
    app.config['SCRAPING_THREAD'] = None
    app.config['DATA_FILE_NAME'] = None
    app.config['LOG_FILE_NAME'] = "log/" + datetime.now().strftime('%Y-%m-%d-%H-%M') + '.log'

    # Initialize logging
    app.logger = init_logger(app.config)

    # Register routes
    register_views(app)
    register_api(app)
    register_filters(app)

    @app.context_processor
    def inject_main_config():
        main_config = app.config.get('MAIN', {})
        return dict(main_config=main_config)

    return app
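One detail worth noting about the `TZ` assignment: on Unix, changing `TZ` in `os.environ` after the interpreter has started does not alter the local-time rules until `time.tzset()` is called. The following is a minimal sketch of that extra step, not the project's code; whether the application needs it depends on how timestamps are produced elsewhere.

```python
import os
import time

os.environ["TZ"] = "UTC"

# time.tzset() is only available on Unix; guard the call for portability.
if hasattr(time, "tzset"):
    time.tzset()

# Offset of local (non-DST) time west of UTC, in seconds
utc_offset_seconds = time.timezone
```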


@@ -1,60 +0,0 @@
import os
import pandas as pd
import matplotlib
matplotlib.use("Agg") # Prevents GUI-related issues in Flask
import matplotlib.pyplot as plt
import seaborn as sns
def load_data(file_path: str) -> pd.DataFrame:
"""Loads the scraped data from a CSV file into a Pandas DataFrame."""
if not os.path.exists(file_path):
raise FileNotFoundError(f"File {file_path} not found.")
df = pd.read_csv(file_path)
df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
df["last_action"] = pd.to_datetime(df["last_action"], errors="coerce")
return df
def generate_statistics(df: pd.DataFrame):
"""Generates activity statistics grouped by hour."""
df["hour"] = df["timestamp"].dt.hour
return df.groupby("hour").size()
def plot_activity_distribution(df: pd.DataFrame, output_path="activity_distribution.png"):
"""Plots user activity distribution and saves the figure."""
# Ensure the directory exists
static_dir = os.path.join("app", "static", "plots")
output_path = os.path.join(static_dir, output_path)
os.makedirs(static_dir, exist_ok=True)
# Convert timestamp column to datetime (if not already)
if not pd.api.types.is_datetime64_any_dtype(df["timestamp"]):
df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
df["hour"] = df["timestamp"].dt.hour
activity_counts = df.groupby("hour").size().reset_index(name="count")
# Use non-GUI backend for Matplotlib
plt.figure(figsize=(10, 5))
# Fix Seaborn Warning: Assign `hue` explicitly
sns.barplot(x="hour", y="count", data=activity_counts, hue="hour", palette="Blues", legend=False)
plt.xlabel("Hour of the Day")
plt.ylabel("Activity Count")
plt.title("User Activity Distribution")
plt.xticks(range(0, 24))
# Save the plot file safely
plt.savefig(output_path, bbox_inches="tight")
plt.close()
# Verify the file exists after saving
if not os.path.exists(output_path):
raise FileNotFoundError(f"Plot could not be saved to {output_path}")
return output_path

34
app/analysis/__init__.py Normal file

@@ -0,0 +1,34 @@
import os
import pkgutil
import importlib
import inspect
from abc import ABC
from .base import BaseAnalysis
import pandas as pd
def load_analysis_modules():
    analysis_modules = []
    package_path = __path__[0]

    for _, module_name, _ in pkgutil.iter_modules([package_path]):
        module = importlib.import_module(f"app.analysis.{module_name}")

        for _, obj in inspect.getmembers(module, inspect.isclass):
            # Exclude abstract classes (like BasePlotAnalysis)
            if issubclass(obj, BaseAnalysis) and obj is not BaseAnalysis and not inspect.isabstract(obj):
                analysis_modules.append(obj())  # Instantiate only concrete classes

    return analysis_modules


def load_data(file_path: str) -> pd.DataFrame:
    """Loads the scraped data from a CSV file into a Pandas DataFrame."""
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"File {file_path} not found.")

    df = pd.read_csv(file_path)
    df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
    df["last_action"] = pd.to_datetime(df["last_action"], errors="coerce")

    return df
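`load_data` parses both time columns with `errors="coerce"`, which turns unparseable values into `NaT` instead of raising. A minimal sketch of that behaviour with one valid and one invalid made-up value:

```python
import pandas as pd

raw = pd.Series(["2025-02-08 18:09:41", "not a timestamp"])

# Invalid entries become NaT rather than raising an exception
parsed = pd.to_datetime(raw, errors="coerce")
bad_rows = parsed.isna()
```

Downstream code therefore has to tolerate `NaT` in `timestamp` and `last_action`; malformed rows are not rejected at load time.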

11
app/analysis/base.py Normal file

@@ -0,0 +1,11 @@
from abc import ABC, abstractmethod
import pandas as pd
class BaseAnalysis(ABC):
    name = "Base Analysis"
    description = "This is a base analysis module."

    @abstractmethod
    def execute(self, df: pd.DataFrame):
        """Run analysis on the given DataFrame"""
        pass


@@ -0,0 +1,77 @@
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from abc import ABC, abstractmethod
from .base import BaseAnalysis
from app.analysis.data_utils import prepare_data, mk_plotdir
import matplotlib
matplotlib.use('Agg')
# -------------------------------------------
# Base Class for All Plot Analyses
# -------------------------------------------
class BasePlotAnalysis(BaseAnalysis, ABC):
    """
    Base class for all plot-based analyses.

    It enforces a structure for:
    - Data preparation
    - Transformation
    - Plot generation
    - Memory cleanup

    Attributes:
        plot_filename (str): The filename for the output plot.
        alt_text (str): The alt text for the plot.
    """
    plot_filename = "default_plot.png"
    alt_text = "Default Alt Text"

    def execute(self, df: pd.DataFrame):
        """
        Executes the full analysis pipeline.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            str: HTML img tag containing the URL to the generated plot.
        """
        df = prepare_data(df)  # Step 1: Prepare data

        paths = mk_plotdir(self.plot_filename)
        self.output_path, self.plot_url = paths['output_path'], paths['plot_url']

        df = self.transform_data(df)  # Step 2: Transform data (implemented by subclass)
        self.plot_data(df)  # Step 3: Create the plot

        plt.savefig(self.output_path, bbox_inches="tight")
        plt.close()

        del df  # Step 4: Free memory
        # Use the documented alt_text attribute; `note` is not defined on this base class
        return f'<img src="{self.plot_url}" alt="{self.alt_text}">'

    @abstractmethod
    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Subclasses must define how they transform the data.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame.
        """
        pass

    @abstractmethod
    def plot_data(self, df: pd.DataFrame):
        """
        Subclasses must define how they generate the plot.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing data to be plotted.
        """
        pass
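`execute()` interpolates the plot URL and the alt text straight into an HTML string. If the alt text can ever contain quotes or angle brackets, escaping it first keeps the tag well-formed. The sketch below shows one way to do that with the standard library; `img_tag` is a hypothetical helper, not something the project defines.

```python
from html import escape


def img_tag(plot_url: str, alt_text: str) -> str:
    """Build an <img> tag with attribute values escaped for HTML."""
    return f'<img src="{escape(plot_url, quote=True)}" alt="{escape(alt_text, quote=True)}">'


tag = img_tag("/static/plots/peak_hours.png", 'Peak "active" hours')
```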


@@ -0,0 +1,73 @@
import os
import pandas as pd
import plotly.graph_objects as go
from abc import ABC, abstractmethod
from .base import BaseAnalysis
from app.analysis.data_utils import prepare_data, mk_plotdir
# -------------------------------------------
# Base Class for All Plotly Plot Analyses
# -------------------------------------------
class BasePlotlyAnalysis(BaseAnalysis, ABC):
    """
    Base class for all Plotly plot-based analyses.

    It enforces a structure for:
    - Data preparation
    - Transformation
    - Plot generation
    - Memory cleanup

    Attributes:
        plot_filename (str): The filename for the output plot.
        alt_text (str): The alt text for the plot.
    """
    plot_filename = "default_plot.html"
    alt_text = "Default Alt Text"

    def execute(self, df: pd.DataFrame):
        """
        Executes the full analysis pipeline.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            str: HTML iframe containing the URL to the generated plot.
        """
        df = prepare_data(df)  # Step 1: Prepare data

        paths = mk_plotdir(self.plot_filename)
        self.output_path, self.plot_url = paths['output_path'], paths['plot_url']

        df = self.transform_data(df)  # Step 2: Transform data (implemented by subclass)
        self.plot_data(df)  # Step 3: Create the plot

        # Save the plot as an HTML file
        self.fig.write_html(self.output_path)

        del df  # Step 4: Free memory
        return f'<iframe src="{self.plot_url}" width="100%" height="600"></iframe>'

    @abstractmethod
    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Subclasses must define how they transform the data.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame.
        """
        pass

    @abstractmethod
    def plot_data(self, df: pd.DataFrame):
        """
        Subclasses must define how they generate the plot.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing data to be plotted.
        """
        pass


@@ -0,0 +1,45 @@
from flask import current_app, url_for
import os
import pandas as pd
def prepare_data(df):
    """
    Prepares the data for analysis by converting timestamps, calculating previous timestamps,
    determining active status, and extracting the hour from the timestamp.

    Parameters:
        df (pd.DataFrame): The input DataFrame containing user activity data.

    Returns:
        pd.DataFrame: The processed DataFrame with additional columns for analysis.

    The returned DataFrame will have the following columns:
        user_id name last_action status timestamp prev_timestamp was_active hour
        0 12345678 UserName 2025-02-08 17:58:11 Okay 2025-02-08 18:09:41.867984056 NaT False 18
    """
    df["timestamp"] = pd.to_datetime(df["timestamp"])
    df["last_action"] = pd.to_datetime(df["last_action"])
    df["prev_timestamp"] = df.groupby("user_id")["timestamp"].shift(1)
    df["was_active"] = (df["timestamp"] - df["last_action"]) <= pd.Timedelta(seconds=60)
    df["was_active"] = df["was_active"].fillna(False)
    df['hour'] = df['timestamp'].dt.hour
    return df


def mk_plotdir(output_filename):
    """
    Creates the directory for storing plots and generates the output path and URL for the plot.

    Parameters:
        output_filename (str): The filename for the output plot.

    Returns:
        dict: A dictionary containing the output path and plot URL.
    """
    plots_dir = os.path.join(current_app.root_path, "static", "plots")
    os.makedirs(plots_dir, exist_ok=True)

    output_path = os.path.join(plots_dir, output_filename)
    plot_url = url_for('static', filename=f'plots/{output_filename}', _external=True)

    return {'output_path': output_path, 'plot_url': plot_url}
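`mk_plotdir` depends on `current_app` and `url_for`, so it only works while a Flask application or request context is active. The sketch below illustrates how `url_for` builds the static URL, using a throwaway app and a test request context as stand-ins for the real application and an incoming request; the file name is made up.

```python
from flask import Flask, url_for

app = Flask(__name__)

# url_for needs an active context; a test request context is used here
# in place of a real incoming request.
with app.test_request_context():
    relative_url = url_for("static", filename="plots/peak_hours.png")
    external_url = url_for("static", filename="plots/peak_hours.png", _external=True)
```

With `_external=True` the result includes scheme and host, which matters when the returned HTML is embedded somewhere other than the page that requested it.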


@@ -0,0 +1,51 @@
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from .basePlotAnalysis import BasePlotAnalysis
from flask import current_app, url_for
import matplotlib
matplotlib.use('Agg')
class PlotTopActiveUsers(BasePlotAnalysis):
    """
    Class for analyzing the most active users and generating a bar chart.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Top Active Users"
    description = "Displays the most active users based on their number of recorded actions."
    plot_filename = "bar_activity-per-user.png"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Transform data for the bar plot.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame with active counts per user.
        """
        df = df[df['was_active'] == True].groupby('name').size().reset_index(name='active_count')
        return df

    def plot_data(self, df: pd.DataFrame):
        """
        Generate bar plot.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing active counts per user.
        """
        # create a barplot from active counts sorted by active count
        plt.figure(figsize=(10, 6))
        sns.barplot(x='active_count', y='name', data=df.sort_values('active_count', ascending=False))
        plt.xticks(rotation=90)
        plt.title('Minutes Active')
        # Counts are on the x-axis and player names on the y-axis
        plt.xlabel('Active Count')
        plt.ylabel('Player')


@@ -0,0 +1,53 @@
import os
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from .basePlotAnalysis import BasePlotAnalysis
import matplotlib
matplotlib.use('Agg')
class PlotPeakHours(BasePlotAnalysis):
    """
    Class for analyzing peak activity hours and generating a bar chart.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Peak Hours Analysis"
    description = "Identifies peak activity hours using a bar chart."
    plot_filename = "peak_hours.png"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        No additional transformation is needed: the was_active and hour columns
        are already added by prepare_data. See data_utils.py.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The unchanged input DataFrame.
        """
        return df

    def plot_data(self, df: pd.DataFrame):
        """
        Generate bar chart for peak hours.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing user activity data.
        """
        peak_hours = df[df["was_active"]]["hour"].value_counts().sort_index()

        plt.figure(figsize=(12, 5))
        sns.barplot(x=peak_hours.index, y=peak_hours.values, hue=peak_hours.values, palette="coolwarm")

        plt.xlabel("Hour of the Day")
        plt.ylabel("Activity Count")
        plt.title("Peak Hours of User Activity")
        plt.xticks(range(0, 24))


@@ -0,0 +1,55 @@
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from .basePlotAnalysis import BasePlotAnalysis
import matplotlib
matplotlib.use('Agg')
class PlotActivityHeatmap(BasePlotAnalysis):
    """
    Class for analyzing user activity trends over multiple days and generating a heatmap.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Activity Heatmap"
    description = "Displays user activity trends over multiple days using a heatmap. Generates a downloadable PNG image."
    plot_filename = "activity_heatmap.png"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Transform data for the heatmap.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame with activity counts by hour.
        """
        active_counts = df[df['was_active']].pivot_table(
            index='name',
            columns='hour',
            values='was_active',
            aggfunc='sum',
            fill_value=0
        )
        active_counts['total_active_minutes'] = active_counts.sum(axis=1)
        return active_counts.sort_values(by='total_active_minutes', ascending=False)

    def plot_data(self, df: pd.DataFrame):
        """
        Generate heatmap plot.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing activity counts by hour.
        """
        plt.figure(figsize=(12, 8))
        sns.heatmap(df.loc[:, df.columns != 'total_active_minutes'], cmap='viridis', cbar_kws={'label': 'Count of was_active == True'})
        plt.xlabel('Hour of Day')
        # Rows are indexed by user name, not by user ID
        plt.ylabel('User')
        plt.title('User Activity Heatmap')
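The `pivot_table` call in `transform_data` is what turns the long sample log into a user-by-hour grid. A minimal sketch with made-up rows shows the shape of the result:

```python
import pandas as pd

samples = pd.DataFrame({
    "name": ["UserA", "UserA", "UserB", "UserA", "UserB"],
    "hour": [18, 18, 18, 19, 19],
    "was_active": [True, True, False, True, True],
})

# Keep only active samples, then count them per user and hour
active_counts = samples[samples["was_active"]].pivot_table(
    index="name", columns="hour", values="was_active", aggfunc="sum", fill_value=0
)
```

`UserB` has no active sample at hour 18, so `fill_value=0` supplies the zero for that cell. Hours in which nobody was active do not appear as columns at all.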


@@ -0,0 +1,67 @@
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from .basePlotAnalysis import BasePlotAnalysis
from flask import current_app, url_for
import matplotlib
matplotlib.use('Agg')
class PlotLineActivityAllUsers(BasePlotAnalysis):
    """
    Class for analyzing user activity trends over multiple days and generating a line graph.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Activity Line Graph (All Users)"
    description = "This analysis shows the activity line graph for all users. Generates a downloadable PNG image."
    plot_filename = "line_activity-all_users.png"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Transform data for the line plot.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame with activity counts by hour.
        """
        df['hour'] = df['timestamp'].dt.hour
        df = df[df['was_active'] == True].pivot_table(index='name', columns='hour', values='was_active', aggfunc='sum', fill_value=0)
        df['total_active_minutes'] = df.sum(axis=1)
        df = df.sort_values(by='total_active_minutes', ascending=False).drop('total_active_minutes', axis=1)

        cumulative_sum_row = df.cumsum().iloc[-1]
        df.loc['Cumulative Sum'] = cumulative_sum_row

        return df

    def plot_data(self, df: pd.DataFrame):
        """
        Generate line graph for user activity throughout the day.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing activity counts by hour.
        """
        plt.figure(figsize=(12, 6))

        # Plot each user's activity
        for index, row in df.iterrows():
            if index == 'Cumulative Sum':
                plt.plot(row.index, row.values, label=index, linewidth=3, color='black')  # Bold line for cumulative sum
            else:
                plt.plot(row.index, row.values, label=index)

        # Add labels and title
        plt.xlabel('Hour of Day')
        plt.ylabel('Activity Count')
        plt.title('User Activity Throughout the Day')
        plt.legend(loc='upper left', bbox_to_anchor=(1, 1))
        plt.grid(True)
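The row labelled `Cumulative Sum` holds the per-hour total across all users: the last row of a column-wise cumulative sum is the same as the plain column sum. A minimal sketch with made-up counts:

```python
import pandas as pd

counts = pd.DataFrame(
    {18: [2, 0], 19: [1, 1]},
    index=pd.Index(["UserA", "UserB"], name="name"),
)

via_cumsum = counts.cumsum().iloc[-1]  # pattern used in transform_data
via_sum = counts.sum()                 # equivalent per-hour total
```

Either form yields the values for the bold black line; `counts.sum()` states the intent more directly.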


@@ -0,0 +1,82 @@
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from .basePlotlyAnalysis import BasePlotlyAnalysis
from flask import current_app, url_for
class PlotlyActivityHeatmap(BasePlotlyAnalysis):
    """
    Class for analyzing user activity trends over multiple days and generating an interactive heatmap.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Activity Heatmap (Interactive)"
    description = "Displays user activity trends over multiple days using an interactive heatmap."
    plot_filename = "activity_heatmap.html"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Transform data for the heatmap.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame with activity counts by hour.
        """
        df['hour'] = df['timestamp'].dt.hour
        active_counts = df[df['was_active']].pivot_table(
            index='name',
            columns='hour',
            values='was_active',
            aggfunc='sum',
            fill_value=0
        ).reset_index()

        # Ensure all hours are represented
        all_hours = pd.DataFrame({'hour': range(24)})
        active_counts = active_counts.melt(id_vars='name', var_name='hour', value_name='activity_count')
        active_counts = active_counts.merge(all_hours, on='hour', how='right').fillna(0)
        active_counts['hour'] = active_counts['hour'].astype(int)  # Ensure hour is treated as numeric
        return active_counts

    def plot_data(self, df: pd.DataFrame):
        """
        Generate heatmap plot.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing activity counts by hour.
        """
        df = df.pivot(index='name', columns='hour', values='activity_count').fillna(0)

        # Create a Plotly heatmap
        self.fig = go.Figure(data=go.Heatmap(
            z=df.values,
            x=df.columns,
            y=df.index,
            colorscale='Viridis',
            colorbar=dict(title='Count of was_active == True')
        ))

        # Update layout
        self.fig.update_layout(
            title='User Activity Heatmap',
            xaxis_title='Hour of Day',
            yaxis_title='User',  # rows are user names, not IDs
            xaxis=dict(tickmode='linear', dtick=1, range=[0, 23]),  # Ensure x-axis covers all hours
            template='plotly_white'
        )

        self.fig.update_traces(
            hovertemplate="<br>".join([
                "Hour: %{x}",
                "Name: %{y}",
                "Activity: %{z}",
            ])
        )
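The goal of showing all 24 hours can also be reached on the wide (pivoted) table with `DataFrame.reindex`. This is an alternative technique sketched here for comparison, not what the module above does; the counts are made up.

```python
import pandas as pd

pivot = pd.DataFrame(
    {18: [2, 0], 19: [1, 1]},
    index=pd.Index(["UserA", "UserB"], name="name"),
)

# Add the missing hours as zero-filled columns, in order 0..23
full_day = pivot.reindex(columns=range(24), fill_value=0)
```

Reindexing keeps the user index untouched, so no placeholder rows with an empty `name` can appear for hours without data.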


@@ -0,0 +1,65 @@
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from .basePlotlyAnalysis import BasePlotlyAnalysis
from flask import current_app, url_for
class PlotlyLineActivityAllUsers(BasePlotlyAnalysis):
    """
    Class for analyzing user activity trends over multiple days and generating an interactive line graph.

    Attributes:
        name (str): The name of the analysis.
        description (str): A brief description of the analysis.
        plot_filename (str): The filename for the output plot.
        note (str): Additional notes for the analysis.
    """
    name = "Activity Line Graph (All Users, Interactive)"
    description = "This analysis shows the activity line graph for all users. The graph is interactive and can be used to explore the data."
    plot_filename = "line_activity-all_users.html"
    note = ""

    def transform_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Transform data for the line plot.

        Parameters:
            df (pd.DataFrame): The input DataFrame containing user activity data.

        Returns:
            pd.DataFrame: The transformed DataFrame with activity counts by hour.
        """
        df['hour'] = df['timestamp'].dt.hour
        df = df[df['was_active'] == True].pivot_table(index='name', columns='hour', values='was_active', aggfunc='sum', fill_value=0)
        df['total_active_minutes'] = df.sum(axis=1)
        df = df.sort_values(by='total_active_minutes', ascending=False).drop('total_active_minutes', axis=1)

        cumulative_sum_row = df.cumsum().iloc[-1]
        df.loc['Cumulative Sum'] = cumulative_sum_row

        return df

    def plot_data(self, df: pd.DataFrame):
        """
        Generate interactive line graph for user activity throughout the day.

        Parameters:
            df (pd.DataFrame): The transformed DataFrame containing activity counts by hour.
        """
        self.fig = make_subplots()

        # Plot each user's activity
        for index, row in df.iterrows():
            if index == 'Cumulative Sum':
                self.fig.add_trace(go.Scatter(x=row.index, y=row.values, mode='lines', name=index, line=dict(width=3, color='black')))  # Bold line for cumulative sum
            else:
                self.fig.add_trace(go.Scatter(x=row.index, y=row.values, mode='lines', name=index))

        self.fig.update_layout(
            title='User Activity Throughout the Day',
            xaxis_title='Hour of Day',
            yaxis_title='Activity Count',
            legend_title='User',
            legend=dict(x=1, y=1),
            template='plotly_white'
        )


@@ -0,0 +1,31 @@
import pandas as pd
from .base import BaseAnalysis
from flask import render_template_string
class GenerateStatistics(BaseAnalysis):
    name = "Test Statistics (Placeholder)"
    description = "Generates activity statistics grouped by hour."

    def execute(self, df: pd.DataFrame):
        df["hour"] = df["timestamp"].dt.hour
        statistics = df.groupby("hour").size().reset_index(name="count")

        # Convert statistics DataFrame to HTML
        table_html = statistics.to_html(classes="table table-bordered table-striped")

        # Wrap it in Bootstrap styling
        html_content = render_template_string(
            """
            <div class="card mt-3">
                <div class="card-header">
                    <h4>Activity Statistics</h4>
                </div>
                <div class="card-body">
                    {{ table_html | safe }}
                </div>
            </div>
            """,
            table_html=table_html
        )

        return html_content
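The data part of this placeholder analysis can be tried without Flask. The sketch below runs the same group-and-count step on three made-up timestamps and renders the result with `DataFrame.to_html`; passing `index=False` is a choice made for this illustration and differs from the call above.

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2025-02-08 18:09:41", "2025-02-08 18:10:41", "2025-02-08 19:00:05",
    ])
})

df["hour"] = df["timestamp"].dt.hour

# One row per hour with the number of recorded samples
statistics = df.groupby("hour").size().reset_index(name="count")
table_html = statistics.to_html(classes="table table-bordered table-striped", index=False)
```

Note that this counts every recorded sample per hour, regardless of `was_active`, which matches the placeholder's description of statistics grouped by hour.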


@@ -6,16 +6,11 @@ import glob
from datetime import datetime from datetime import datetime
import pandas as pd import pandas as pd
from app.models import Scraper, generate_statistics from app.models import Scraper
from app.util import create_zip, delete_old_zips, tail, get_size from app.util import create_zip, delete_old_zips, tail
from app.config import load_config from app.config import load_config
from app.logging_config import get_logger
from app.forms import ScrapingForm from app.forms import ScrapingForm
config = load_config()
logger = get_logger()
log_file_name = logger.handlers[0].baseFilename
scraping_thread = None scraping_thread = None
scraper = None scraper = None
scrape_lock = threading.Lock() scrape_lock = threading.Lock()
@@ -23,10 +18,11 @@ scrape_lock = threading.Lock()
def register_api(app): def register_api(app):
@app.route('/start_scraping', methods=['POST']) @app.route('/start_scraping', methods=['POST'])
def start_scraping(): def start_scraping():
global scraping_thread, scraper
with scrape_lock: with scrape_lock:
scraper = current_app.config.get('SCRAPER') scraper = current_app.config.get('SCRAPER')
if scraper is not None and scraper.scraping_active: if scraper is not None and scraper.scraping_active:
logger.warning("Can't start scraping process: scraping already in progress") current_app.logger.warning("Can't start scraping process: scraping already in progress")
return jsonify({"status": "Scraping already in progress"}) return jsonify({"status": "Scraping already in progress"})
form = ScrapingForm() form = ScrapingForm()
@@ -35,10 +31,10 @@ def register_api(app):
fetch_interval = form.fetch_interval.data fetch_interval = form.fetch_interval.data
run_interval = form.run_interval.data run_interval = form.run_interval.data
scraper = Scraper(faction_id, fetch_interval, run_interval, current_app) scraper = Scraper(faction_id, fetch_interval, run_interval, app)
scraper.scraping_active = True scraper.scraping_active = True
scraping_thread = threading.Thread(target=scraper.start_scraping) scraping_thread = threading.Thread(target=scraper.start_scraping, args=(app,))
scraping_thread.daemon = True scraping_thread.daemon = True
scraping_thread.start() scraping_thread.start()
@@ -56,19 +52,21 @@ def register_api(app):
scraper.stop_scraping() scraper.stop_scraping()
current_app.config['SCRAPING_ACTIVE'] = False current_app.config['SCRAPING_ACTIVE'] = False
logger.debug("Scraping stopped by user") current_app.logger.debug("Scraping stopped by user")
return jsonify({"status": "Scraping stopped"}) return jsonify({"status": "Scraping stopped"})
@app.route('/logfile', methods=['GET']) @app.route('/logfile', methods=['GET'])
def logfile(): def logfile():
log_file_name = current_app.logger.handlers[0].baseFilename
page = int(request.args.get('page', 0)) # Page number page = int(request.args.get('page', 0)) # Page number
lines_per_page = int(request.args.get('lines_per_page', config['LOGGING']['VIEW_PAGE_LINES'])) # Lines per page lines_per_page = int(request.args.get('lines_per_page', current_app.config['LOGGING']['VIEW_PAGE_LINES'])) # Lines per page
log_file_path = log_file_name # Path to the current log file log_file_path = log_file_name # Path to the current log file
if not os.path.isfile(log_file_path): if not os.path.isfile(log_file_path):
logger.error("Log file not found") current_app.logger.error("Log file not found")
return jsonify({"error": "Log file not found"}), 404 return jsonify({"error": "Log file not found"}), 404
log_lines = list(tail(log_file_path, config['LOGGING']['VIEW_MAX_LINES'])) log_lines = list(tail(log_file_path, current_app.config['LOGGING']['VIEW_MAX_LINES']))
log_lines = log_lines[::-1] # Reverse the list log_lines = log_lines[::-1] # Reverse the list
@@ -123,14 +121,15 @@ def register_api(app):
@app.route('/delete_files', methods=['POST']) @app.route('/delete_files', methods=['POST'])
def delete_files(): def delete_files():
log_file_name = current_app.logger.handlers[0].baseFilename
file_paths = request.json.get('file_paths', []) file_paths = request.json.get('file_paths', [])
if not file_paths: if not file_paths:
return jsonify({"error": "No files specified"}), 400 return jsonify({"error": "No files specified"}), 400
errors = [] errors = []
data_dir = os.path.abspath(config['DATA']['DATA_DIR']) data_dir = os.path.abspath(current_app.config['DATA']['DATA_DIR'])
log_dir = os.path.abspath(config['LOGGING']['LOG_DIR']) log_dir = os.path.abspath(current_app.config['LOGGING']['LOG_DIR'])
for file_path in file_paths: for file_path in file_paths:
if file_path.startswith('/data/'): if file_path.startswith('/data/'):
@@ -171,40 +170,46 @@ def register_api(app):
@app.route('/data/<path:filename>') @app.route('/data/<path:filename>')
def download_data_file(filename): def download_data_file(filename):
data_dir = os.path.abspath(config['DATA']['DATA_DIR']) data_dir = os.path.abspath(current_app.config['DATA']['DATA_DIR'])
file_path = os.path.join(data_dir, filename) file_path = os.path.join(data_dir, filename)
return send_from_directory(directory=data_dir, path=filename, as_attachment=True) return send_from_directory(directory=data_dir, path=filename, as_attachment=True)
@app.route('/log/<path:filename>') @app.route('/log/<path:filename>')
def download_log_file(filename): def download_log_file(filename):
log_dir = os.path.abspath(config['LOGGING']['LOG_DIR']) log_dir = os.path.abspath(current_app.config['LOGGING']['LOG_DIR'])
file_path = os.path.join(log_dir, filename) file_path = os.path.join(log_dir, filename)
return send_from_directory(directory=log_dir, path=filename, as_attachment=True) return send_from_directory(directory=log_dir, path=filename, as_attachment=True)
@app.route('/tmp/<path:filename>') @app.route('/tmp/<path:filename>')
def download_tmp_file(filename): def download_tmp_file(filename):
tmp_dir = os.path.abspath(config['TEMP']['TEMP_DIR']) tmp_dir = os.path.abspath(current_app.config['TEMP']['TEMP_DIR'])
file_path = os.path.join(tmp_dir, filename) file_path = os.path.join(tmp_dir, filename)
return send_from_directory(directory=tmp_dir, path=filename, as_attachment=True) return send_from_directory(directory=tmp_dir, path=filename, as_attachment=True)
@app.route('/config/lines_per_page') @app.route('/config/lines_per_page')
def get_lines_per_page(): def get_lines_per_page():
lines_per_page = config['LOGGING']['VIEW_PAGE_LINES'] lines_per_page = current_app.config['LOGGING']['VIEW_PAGE_LINES']
return jsonify({"lines_per_page": lines_per_page}) return jsonify({"lines_per_page": lines_per_page})
@app.route('/scraping_status', methods=['GET']) @app.route('/scraping_status', methods=['GET'])
def scraping_status(): def scraping_status():
if scraper is None: if scraper is None:
logger.debug("Scraper is not initialized.") current_app.logger.debug("Scraper is not initialized.")
return jsonify({"scraping_active": False}) return jsonify({"scraping_active": False})
if scraper.scraping_active: if scraper.scraping_active:
logger.debug("Scraping is active.") current_app.logger.debug("Scraping is active.")
return jsonify({"scraping_active": True}) return jsonify({"scraping_active": True})
else: else:
logger.debug("Scraping is not active.") current_app.logger.debug("Scraping is not active.")
return jsonify({"scraping_active": False}) return jsonify({"scraping_active": False})
@app.route('/scraping_get_end_time')
def scraping_get_end_time():
if scraper is None:
current_app.logger.debug("Scraper is not initialized.")
return jsonify({"scraping_active":False})
return jsonify({"end_time": scraper.end_time})


@@ -1,42 +0,0 @@
from flask import Flask
from flask_bootstrap import Bootstrap5
from datetime import datetime
from app.views import register_views
from app.api import register_api
from app.config import load_config
from app.filters import register_filters
from app.analysis import generate_statistics
def init_app():
config = load_config()
# Initialize app
app = Flask(__name__)
# Load configuration
app.config['SECRET_KEY'] = config['DEFAULT']['SECRET_KEY']
app.config['API_KEY'] = config['DEFAULT']['API_KEY']
app.config['DATA'] = config['DATA']
app.config['TEMP'] = config['TEMP']
app.config['LOGGING'] = config['LOGGING']
# Move bootstrap settings to root level
for key in config['BOOTSTRAP']:
app.config[key.upper()] = config['BOOTSTRAP'][key]
bootstrap = Bootstrap5(app)
# Initialize global variables
app.config['SCRAPING_ACTIVE'] = False
app.config['SCRAPING_THREAD'] = None
app.config['DATA_FILE_NAME'] = None
app.config['LOG_FILE_NAME'] = "log/" + datetime.now().strftime('%Y-%m-%d-%H-%M') + '.log'
# Register routes
register_views(app)
register_api(app)
register_filters(app)
return app


@@ -1,7 +1,8 @@
import configparser from configobj import ConfigObj
import os import os
def load_config(): def load_config():
config = configparser.ConfigParser() config_path = os.path.join(os.path.dirname(__file__), '..', 'config.ini')
config.read(os.path.join(os.path.dirname(__file__), '..', 'config.ini'))
return config # Load config while preserving sections as nested dicts
return ConfigObj(config_path)
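
Review note: after this change, callers treat the config as nested dicts, for example `config.get('LOGGING', {}).get('LOG_DIR', 'log')` in the logging module. As a rough illustration of that access pattern using only the standard library, the sketch below converts a `configparser` result into plain nested dicts. The helper name `load_config_as_dict` and the sample INI text are assumptions for illustration, not part of this commit.

```python
import configparser


def load_config_as_dict(ini_text):
    """Parse INI text and return {section: {key: value}} as plain dicts."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case; configparser lowercases keys by default
    parser.read_string(ini_text)
    return {section: dict(parser.items(section)) for section in parser.sections()}


# Hypothetical config content; the real config.ini is not shown in this diff.
SAMPLE = """
[LOGGING]
LOG_DIR = log
VIEW_PAGE_LINES = 50
"""

config = load_config_as_dict(SAMPLE)
log_dir = config.get("LOGGING", {}).get("LOG_DIR", "log")
missing = config.get("TEMP", {}).get("TEMP_DIR", "temp")  # section absent, falls back
print(log_dir, missing)
```

Values parsed this way are strings, so numeric settings such as `VIEW_PAGE_LINES` still need an explicit `int(...)`. Note also that `configparser` treats a `[DEFAULT]` section specially (it is not listed by `sections()` and its keys are merged into every other section), so this sketch is not a drop-in replacement for the real loader.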


@@ -4,36 +4,31 @@ from queue import Queue
import os import os
from datetime import datetime from datetime import datetime
from app.config import load_config from flask import current_app
config = load_config() def init_logger(config):
LOG_DIR = config.get('LOGGING', {}).get('LOG_DIR', 'log')
# Define the log directory and ensure it exists if not os.path.exists(LOG_DIR):
LOG_DIR = config['LOGGING']['LOG_DIR'] os.makedirs(LOG_DIR)
if not os.path.exists(LOG_DIR):
os.makedirs(LOG_DIR)
# Generate the log filename dynamically log_file_name = os.path.join(LOG_DIR, datetime.now().strftime('%Y-%m-%d-%H-%M') + '.log')
log_file_name = os.path.join(LOG_DIR, datetime.now().strftime('%Y-%m-%d-%H-%M') + '.log')
# Initialize the logger logger = logging.getLogger(__name__)
logger = logging.getLogger(__name__) logger.setLevel(logging.DEBUG)
logger.setLevel(logging.DEBUG)
# File handler file_handler = logging.FileHandler(log_file_name, mode='w')
file_handler = logging.FileHandler(log_file_name, mode='w') file_handler.setLevel(logging.DEBUG)
file_handler.setLevel(logging.DEBUG) formatter = logging.Formatter('%(asctime)s - %(levelname)s: %(message)s',
formatter = logging.Formatter('%(asctime)s - %(levelname)s: %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')
datefmt='%m/%d/%Y %I:%M:%S %p') file_handler.setFormatter(formatter)
file_handler.setFormatter(formatter) logger.addHandler(file_handler)
logger.addHandler(file_handler)
# Queue handler for real-time logging log_queue = Queue()
log_queue = Queue() queue_handler = QueueHandler(log_queue)
queue_handler = QueueHandler(log_queue) queue_handler.setLevel(logging.DEBUG)
queue_handler.setLevel(logging.DEBUG) logger.addHandler(queue_handler)
logger.addHandler(queue_handler)
logger.debug("Logger initialized")
# Function to get logger in other modules
def get_logger():
return logger return logger


@@ -6,14 +6,7 @@ import time
from datetime import datetime, timedelta from datetime import datetime, timedelta
from requests.exceptions import ConnectionError, Timeout, RequestException from requests.exceptions import ConnectionError, Timeout, RequestException
from app.logging_config import get_logger from flask import current_app
from app.config import load_config
config = load_config()
API_KEY = config['DEFAULT']['API_KEY']
logger = get_logger()
class Scraper: class Scraper:
def __init__(self, faction_id, fetch_interval, run_interval, app): def __init__(self, faction_id, fetch_interval, run_interval, app):
@@ -23,19 +16,21 @@ class Scraper:
self.end_time = datetime.now() + timedelta(days=run_interval) self.end_time = datetime.now() + timedelta(days=run_interval)
self.data_file_name = os.path.join(app.config['DATA']['DATA_DIR'], f"{self.faction_id}-{datetime.now().strftime('%Y-%m-%d-%H-%M')}.csv") self.data_file_name = os.path.join(app.config['DATA']['DATA_DIR'], f"{self.faction_id}-{datetime.now().strftime('%Y-%m-%d-%H-%M')}.csv")
self.scraping_active = False self.scraping_active = False
self.API_KEY = app.config['DEFAULT']['API_KEY']
self.logger = app.logger
print(self.data_file_name) print(self.data_file_name)
def fetch_faction_data(self): def fetch_faction_data(self):
url = f"https://api.torn.com/faction/{self.faction_id}?selections=&key={API_KEY}" url = f"https://api.torn.com/faction/{self.faction_id}?selections=&key={self.API_KEY}"
response = requests.get(url) response = requests.get(url)
if response.status_code == 200: if response.status_code == 200:
return response.json() return response.json()
logger.warning(f"Failed to fetch faction data for faction ID {self.faction_id}. Response: {response.text}") current_app.logger.warning(f"Failed to fetch faction data for faction ID {self.faction_id}. Response: {response.text}")
return None return None
def fetch_user_activity(self, user_id): def fetch_user_activity(self, user_id):
url = f"https://api.torn.com/user/{user_id}?selections=basic,profile&key={API_KEY}" url = f"https://api.torn.com/user/{user_id}?selections=basic,profile&key={self.API_KEY}"
retries = 3 retries = 3
for attempt in range(retries): for attempt in range(retries):
try: try:
@@ -43,45 +38,51 @@ class Scraper:
response.raise_for_status() response.raise_for_status()
return response.json() return response.json()
except ConnectionError as e: except ConnectionError as e:
logger.error(f"Connection error while fetching user activity for user ID {user_id}: {e}") current_app.logger.error(f"Connection error while fetching user activity for user ID {user_id}: {e}")
except Timeout as e: except Timeout as e:
logger.error(f"Timeout error while fetching user activity for user ID {user_id}: {e}") current_app.logger.error(f"Timeout error while fetching user activity for user ID {user_id}: {e}")
except RequestException as e: except RequestException as e:
logger.error(f"Error while fetching user activity for user ID {user_id}: {e}") current_app.logger.error(f"Error while fetching user activity for user ID {user_id}: {e}")
if attempt < retries - 1: if attempt < retries - 1:
current_app.logger.debug(f"Retrying {attempt + 1}/{retries} for user {user_id}")
time.sleep(2 ** attempt) # Exponential backoff time.sleep(2 ** attempt) # Exponential backoff
return None return None
def start_scraping(self) -> None: def start_scraping(self, app) -> None:
"""Starts the scraping process until the end time is reached or stopped manually.""" """Starts the scraping process until the end time is reached or stopped manually."""
self.scraping_active = True self.scraping_active = True
logger.info(f"Starting scraping for faction ID {self.faction_id}")
logger.debug(f"Fetch interval: {self.fetch_interval}s, Run interval: {self.run_interval} days, End time: {self.end_time}")
MAX_FAILURES = 5 # Stop after 5 consecutive failures # Explicitly set the application context
failure_count = 0 with app.app_context():
current_app.logger.info(f"Starting scraping for faction ID {self.faction_id}")
current_app.logger.debug(f"Fetch interval: {self.fetch_interval}s, Run interval: {self.run_interval} days, End time: {self.end_time}")
while datetime.now() < self.end_time and self.scraping_active: MAX_FAILURES = 5 # Stop after 5 consecutive failures
logger.info(f"Fetching data at {datetime.now()}") failure_count = 0
faction_data = self.fetch_faction_data()
if not faction_data or "members" not in faction_data: while datetime.now() < self.end_time and self.scraping_active:
logger.warning(f"No faction data found for ID {self.faction_id} (Failure {failure_count + 1}/{MAX_FAILURES})") current_app.logger.info(f"Fetching data at {datetime.now()}")
failure_count += 1 faction_data = self.fetch_faction_data()
if failure_count >= MAX_FAILURES:
logger.error(f"Max failures reached ({MAX_FAILURES}). Stopping scraping.") if not faction_data or "members" not in faction_data:
break current_app.logger.warning(f"No faction data found for ID {self.faction_id} (Failure {failure_count + 1}/{MAX_FAILURES})")
failure_count += 1
if failure_count >= MAX_FAILURES:
current_app.logger.error(f"Max failures reached ({MAX_FAILURES}). Stopping scraping.")
break
time.sleep(self.fetch_interval)
continue
current_app.logger.info(f"Fetched {len(faction_data['members'])} members for faction {self.faction_id}")
failure_count = 0 # Reset failure count on success
user_activity_data = self.process_faction_members(faction_data["members"])
self.save_data(user_activity_data)
current_app.logger.info(f"Data appended to {self.data_file_name}")
time.sleep(self.fetch_interval) time.sleep(self.fetch_interval)
continue
failure_count = 0 # Reset failure count on success self.handle_scraping_end()
user_activity_data = self.process_faction_members(faction_data["members"])
self.save_data(user_activity_data)
logger.info(f"Data appended to {self.data_file_name}")
time.sleep(self.fetch_interval)
self.handle_scraping_end()
def process_faction_members(self, members: Dict[str, Dict]) -> List[Dict]: def process_faction_members(self, members: Dict[str, Dict]) -> List[Dict]:
"""Processes and retrieves user activity for all faction members.""" """Processes and retrieves user activity for all faction members."""
@@ -96,16 +97,16 @@ class Scraper:
"status": user_activity.get("status", {}).get("state", ""), "status": user_activity.get("status", {}).get("state", ""),
"timestamp": datetime.now().timestamp(), "timestamp": datetime.now().timestamp(),
}) })
logger.info(f"Fetched data for user {user_id} ({user_activity.get('name', '')})") current_app.logger.info(f"Fetched data for user {user_id} ({user_activity.get('name', '')})")
else: else:
logger.warning(f"Failed to fetch data for user {user_id}") current_app.logger.warning(f"Failed to fetch data for user {user_id}")
return user_activity_data return user_activity_data
def save_data(self, user_activity_data: List[Dict]) -> None: def save_data(self, user_activity_data: List[Dict]) -> None:
"""Saves user activity data to a CSV file.""" """Saves user activity data to a CSV file."""
if not user_activity_data: if not user_activity_data:
logger.warning("No data to save.") current_app.logger.warning("No data to save.")
return return
df = pd.DataFrame(user_activity_data) df = pd.DataFrame(user_activity_data)
@@ -117,26 +118,22 @@ class Scraper:
try: try:
with open(self.data_file_name, "a" if file_exists else "w") as f: with open(self.data_file_name, "a" if file_exists else "w") as f:
df.to_csv(f, mode="a" if file_exists else "w", header=not file_exists, index=False) df.to_csv(f, mode="a" if file_exists else "w", header=not file_exists, index=False)
logger.info(f"Data successfully saved to {self.data_file_name}") current_app.logger.info(f"Data successfully saved to {self.data_file_name}")
except Exception as e: except Exception as e:
logger.error(f"Error saving data to {self.data_file_name}: {e}") current_app.logger.error(f"Error saving data to {self.data_file_name}: {e}")
def handle_scraping_end(self) -> None: def handle_scraping_end(self) -> None:
"""Handles cleanup and logging when scraping ends.""" """Handles cleanup and logging when scraping ends."""
if not self.scraping_active: if not self.scraping_active:
logger.warning(f"Scraping stopped manually at {datetime.now()}") current_app.logger.warning(f"Scraping stopped manually at {datetime.now()}")
elif datetime.now() >= self.end_time: elif datetime.now() >= self.end_time:
logger.warning(f"Scraping stopped due to timeout at {datetime.now()} (Run interval: {self.run_interval} days)") current_app.logger.warning(f"Scraping stopped due to timeout at {datetime.now()} (Run interval: {self.run_interval} days)")
else: else:
logger.error(f"Unexpected stop at {datetime.now()}") current_app.logger.error(f"Unexpected stop at {datetime.now()}")
logger.info("Scraping completed.") current_app.logger.info("Scraping completed.")
self.scraping_active = False self.scraping_active = False
def stop_scraping(self): def stop_scraping(self):
self.scraping_active = False self.scraping_active = False
logger.debug("Scraping stopped by user") current_app.logger.debug("Scraping stopped by user")
def generate_statistics(df):
df['hour'] = df['timestamp'].dt.hour # No need to convert timestamp again
return df.groupby('hour').size() # Activity by hour
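
Review note: `fetch_user_activity` retries up to three times and sleeps `2 ** attempt` seconds between failed attempts. A minimal sketch of that retry pattern follows, assuming a generic callable in place of the HTTP request. The names `call_with_backoff` and `flaky` are hypothetical, and the `sleep` parameter is an addition so the delays can be observed without actually waiting.

```python
import time


def call_with_backoff(func, retries=3, sleep=time.sleep):
    """Call func() up to `retries` times with exponential backoff.

    Returns func()'s result, or None if every attempt raised.
    """
    for attempt in range(retries):
        try:
            return func()
        except Exception as exc:  # the code in the diff catches requests exceptions specifically
            print(f"Attempt {attempt + 1}/{retries} failed: {exc}")
        if attempt < retries - 1:
            sleep(2 ** attempt)  # 1s after the first failure, 2s after the second
    return None


# Simulated dependency that fails twice, then succeeds.
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("simulated outage")
    return {"ok": True}


delays = []  # record requested delays instead of sleeping
result = call_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the example fast to run; with the default `time.sleep` the behaviour mirrors the blocking wait used in the scraper thread.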


@@ -1,2 +0,0 @@
data_file_name = None
log_file_name = None

app/static/common.js (new file, 38 lines added)

@@ -0,0 +1,38 @@
import { ScraperUtils } from './scraper_utils.js';
class Common {
constructor() {
this.utils = new ScraperUtils();
this.addEventListeners();
this.scheduleUpdates();
}
scheduleUpdates() {
// Ensure server time updates every minute but only after initial fetch
setTimeout(() => {
setInterval(() => this.utils.updateServerTime(), 60000);
}, 5000); // Delay first scheduled update to prevent duplicate initial request
}
addEventListeners() {
if (this.utils.stopButton) {
this.utils.stopButton.addEventListener('click', () => this.utils.checkScrapingStatus());
}
}
}
document.addEventListener('DOMContentLoaded', () => {
new Common();
});
window.checkAllCheckboxes = function(tableId, checkAllId) {
var table = document.getElementById(tableId);
var checkAll = document.getElementById(checkAllId);
var checkboxes = table.querySelectorAll('input[type="checkbox"]');
checkboxes.forEach(function(checkbox) {
if (!checkbox.disabled) {
checkbox.checked = checkAll.checked;
}
});
};


@@ -94,11 +94,3 @@ function sortTable(columnIndex, tableId) {
// Reinsert sorted rows // Reinsert sorted rows
rows.forEach(row => tbody.appendChild(row)); rows.forEach(row => tbody.appendChild(row));
} }
function checkAllCheckboxes(tableId, checkAllCheckboxId) {
const table = document.getElementById(tableId);
const checkboxes = table.querySelectorAll('input[name="fileCheckbox"]');
const checkAllCheckbox = document.getElementById(checkAllCheckboxId);
checkboxes.forEach(checkbox => checkbox.checked = checkAllCheckbox.checked);
}


@@ -1,91 +1,21 @@
class LogScraperApp { import { ScraperUtils } from './scraper_utils.js';
class ScraperApp {
constructor() { constructor() {
this.utils = new ScraperUtils();
this.form = document.getElementById('scrapingForm'); this.form = document.getElementById('scrapingForm');
this.stopButton = document.getElementById('stopButton'); this.stopButton = document.getElementById('stopButton');
this.logsElement = document.getElementById('logs');
this.prevPageButton = document.getElementById('prevPage');
this.nextPageButton = document.getElementById('nextPage');
this.pageInfo = document.getElementById('pageInfo');
this.startButton = document.getElementById('startButton'); this.startButton = document.getElementById('startButton');
this.currentPage = 0;
this.linesPerPage = null;
this.autoRefreshInterval = null;
this.init(); this.init();
} }
async init() { init() {
await this.fetchConfig(); this.utils.checkScrapingStatus();
await this.checkScrapingStatus();
this.addEventListeners(); this.addEventListeners();
} }
async fetchConfig() {
try {
const response = await fetch('/config/lines_per_page');
const data = await response.json();
this.linesPerPage = data.lines_per_page;
this.fetchLogs(this.currentPage);
} catch (error) {
console.error('Error fetching config:', error);
}
}
async fetchLogs(page) {
try {
const response = await fetch(`/logfile?page=${page}&lines_per_page=${this.linesPerPage}`);
const data = await response.json();
if (data.error) {
this.logsElement.textContent = data.error;
} else {
this.logsElement.innerHTML = data.log.map((line, index) => {
const lineNumber = data.start_line - index;
return `<span class="line-number">${lineNumber}</span> ${line}`;
}).join('');
this.updatePagination(data.total_lines);
}
} catch (error) {
console.error('Error fetching logs:', error);
}
}
updatePagination(totalLines) {
this.prevPageButton.disabled = this.currentPage === 0;
this.nextPageButton.disabled = (this.currentPage + 1) * this.linesPerPage >= totalLines;
this.pageInfo.textContent = `Page ${this.currentPage + 1} of ${Math.ceil(totalLines / this.linesPerPage)}`;
}
startAutoRefresh() {
this.autoRefreshInterval = setInterval(() => this.fetchLogs(this.currentPage), 5000);
}
stopAutoRefresh() {
clearInterval(this.autoRefreshInterval);
}
async checkScrapingStatus() {
try {
const response = await fetch('/scraping_status');
const data = await response.json();
if (data.scraping_active) {
this.startButton.disabled = true;
this.stopButton.disabled = false;
this.startAutoRefresh();
} else {
this.startButton.disabled = false;
this.stopButton.disabled = true;
}
this.fetchLogs(this.currentPage);
} catch (error) {
console.error('Error checking scraping status:', error);
}
}
async startScraping(event) { async startScraping(event) {
event.preventDefault(); event.preventDefault(); // Prevent default form submission
const formData = new FormData(this.form); const formData = new FormData(this.form);
try { try {
const response = await fetch('/start_scraping', { const response = await fetch('/start_scraping', {
@@ -93,12 +23,8 @@ class LogScraperApp {
body: formData body: formData
}); });
const data = await response.json(); const data = await response.json();
console.log(data);
if (data.status === "Scraping started") { if (data.status === "Scraping started") {
this.startButton.disabled = true; this.utils.checkScrapingStatus(); // Update UI
this.stopButton.disabled = false;
this.startAutoRefresh();
} }
} catch (error) { } catch (error) {
console.error('Error starting scraping:', error); console.error('Error starting scraping:', error);
@@ -107,14 +33,12 @@ class LogScraperApp {
async stopScraping() { async stopScraping() {
try { try {
const response = await fetch('/stop_scraping', { method: 'POST' }); const response = await fetch('/stop_scraping', {
method: 'POST'
});
const data = await response.json(); const data = await response.json();
console.log(data);
if (data.status === "Scraping stopped") { if (data.status === "Scraping stopped") {
this.startButton.disabled = false; this.utils.checkScrapingStatus(); // Update UI
this.stopButton.disabled = true;
this.stopAutoRefresh();
} }
} catch (error) { } catch (error) {
console.error('Error stopping scraping:', error); console.error('Error stopping scraping:', error);
@@ -122,23 +46,11 @@ class LogScraperApp {
} }
addEventListeners() { addEventListeners() {
this.prevPageButton.addEventListener('click', () => {
if (this.currentPage > 0) {
this.currentPage--;
this.fetchLogs(this.currentPage);
}
});
this.nextPageButton.addEventListener('click', () => {
this.currentPage++;
this.fetchLogs(this.currentPage);
});
this.form.addEventListener('submit', (event) => this.startScraping(event)); this.form.addEventListener('submit', (event) => this.startScraping(event));
this.stopButton.addEventListener('click', () => this.stopScraping()); this.stopButton.addEventListener('click', () => this.stopScraping());
} }
} }
// Initialize the application when DOM is fully loaded document.addEventListener('DOMContentLoaded', () => {
document.addEventListener('DOMContentLoaded', () => new LogScraperApp()); new ScraperApp();
});

app/static/log_viewer.js (new file, 97 lines added)

@@ -0,0 +1,97 @@
class LogViewerApp {
constructor() {
this.logsElement = document.getElementById('logs');
this.prevPageButton = document.getElementById('prevPage');
this.nextPageButton = document.getElementById('nextPage');
this.pageInfo = document.getElementById('pageInfo');
this.currentPage = 0;
this.linesPerPage = null;
this.autoRefreshInterval = null;
this.init();
}
async init() {
await this.fetchConfig();
await this.checkScrapingStatus();
this.addEventListeners();
}
async fetchConfig() {
try {
const response = await fetch('/config/lines_per_page');
const data = await response.json();
this.linesPerPage = data.lines_per_page;
this.fetchLogs(this.currentPage);
} catch (error) {
console.error('Error fetching config:', error);
}
}
async fetchLogs(page) {
try {
const response = await fetch(`/logfile?page=${page}&lines_per_page=${this.linesPerPage}`);
const data = await response.json();
if (data.error) {
this.logsElement.textContent = data.error;
} else {
this.logsElement.innerHTML = data.log.map((line, index) => {
const lineNumber = data.start_line - index;
return `<span class="line-number">${lineNumber}</span> ${line}`;
}).join('');
this.updatePagination(data.total_lines);
}
} catch (error) {
console.error('Error fetching logs:', error);
}
}
updatePagination(totalLines) {
this.prevPageButton.disabled = this.currentPage === 0;
this.nextPageButton.disabled = (this.currentPage + 1) * this.linesPerPage >= totalLines;
this.pageInfo.textContent = `Page ${this.currentPage + 1} of ${Math.ceil(totalLines / this.linesPerPage)}`;
}
startAutoRefresh() {
this.autoRefreshInterval = setInterval(() => this.fetchLogs(this.currentPage), 5000);
}
stopAutoRefresh() {
clearInterval(this.autoRefreshInterval);
}
async checkScrapingStatus() {
try {
const response = await fetch('/scraping_status');
const data = await response.json();
if (data.scraping_active) {
this.startAutoRefresh();
} else {
this.stopAutoRefresh();
}
this.fetchLogs(this.currentPage);
} catch (error) {
console.error('Error checking scraping status:', error);
}
}
addEventListeners() {
this.prevPageButton.addEventListener('click', () => {
if (this.currentPage > 0) {
this.currentPage--;
this.fetchLogs(this.currentPage);
}
});
this.nextPageButton.addEventListener('click', () => {
this.currentPage++;
this.fetchLogs(this.currentPage);
});
}
}
// Initialize the application when DOM is fully loaded
document.addEventListener('DOMContentLoaded', () => new LogViewerApp());

app/static/scraper_utils.js (new file, 180 lines added)

@@ -0,0 +1,180 @@
export class ScraperUtils {
constructor() {
this.activityIndicator = document.getElementById('activity_indicator');
this.endTimeElement = document.getElementById('end_time');
this.serverTimeElement = document.getElementById('server_time');
this.timeLeftElement = document.getElementById('time-left'); // New element for countdown
this.stopButton = document.getElementById('stopButton');
this.startButton = document.getElementById('startButton');
this.statusContainer = document.getElementById('status_container');
this.loadingIndicator = document.getElementById('loading_indicator');
this.statusContent = document.querySelectorAll('#status_content');
this.serverTime = null;
this.endTime = null;
this.init();
}
async init() {
this.showLoadingIndicator();
try {
// Ensure each function runs only once
await Promise.all([
this.updateServerTime(),
this.checkScrapingStatus()
]);
} catch (error) {
console.error("Error during initialization:", error);
}
// Ensure end time is fetched only if scraping is active
if (this.endTime === null) {
try {
await this.fetchEndTime();
} catch (error) {
console.error("Error fetching end time:", error);
}
}
// Ensure UI is only updated once everything is ready
if (this.serverTime && this.endTime) {
this.startClock();
this.hideLoadingIndicator();
} else {
console.warn("Delaying hiding the loading indicator due to missing data...");
const checkDataInterval = setInterval(() => {
if (this.serverTime && this.endTime) {
clearInterval(checkDataInterval);
this.startClock();
this.hideLoadingIndicator();
}
}, 500);
}
}
showLoadingIndicator() {
this.statusContainer.classList.remove('d-none');
this.loadingIndicator.classList.remove('d-none');
this.statusContent.forEach(element => element.classList.add('d-none'));
}
hideLoadingIndicator() {
this.loadingIndicator.classList.add('d-none');
this.statusContent.forEach(element => element.classList.remove('d-none'));
}
async checkScrapingStatus() {
try {
const response = await fetch('/scraping_status');
const data = await response.json();
if (data.scraping_active) {
if (this.startButton) this.startButton.disabled = true;
if (this.stopButton) this.stopButton.disabled = false;
this.activityIndicator.classList.remove('text-bg-danger');
this.activityIndicator.classList.add('text-bg-success');
this.activityIndicator.textContent = 'Active';
console.log(`Scraping is active until ${data.end_time} TCT`);
// Only call fetchEndTime() if endTime is not already set
if (!this.endTime) {
await this.fetchEndTime();
}
this.endTimeElement.classList.remove('d-none');
this.timeLeftElement.classList.remove('d-none');
} else {
if (this.startButton) this.startButton.disabled = false;
if (this.stopButton) this.stopButton.disabled = true;
this.activityIndicator.classList.remove('text-bg-success');
this.activityIndicator.classList.add('text-bg-danger');
this.activityIndicator.textContent = 'Inactive';
this.endTimeElement.classList.add('d-none');
this.timeLeftElement.classList.add('d-none');
}
} catch (error) {
console.error('Error checking scraping status:', error);
}
}
async updateServerTime() {
try {
const response = await fetch('/server_time');
const data = await response.json();
this.serverTime = new Date(data.server_time.replace(' ', 'T'));
this.serverTimeElement.textContent = `Server Time (TCT): ${this.formatDateToHHMMSS(this.serverTime)}`;
} catch (error) {
console.error('Error fetching server time:', error);
}
}
async fetchEndTime() {
if (this.endTime) return;
try {
const response = await fetch('/scraping_get_end_time');
const data = await response.json();
if (data.end_time) {
this.endTime = new Date(data.end_time);
this.endTimeElement.textContent = `Running until ${this.formatDateToYYYYMMDDHHMMSS(this.endTime)} TCT`;
}
} catch (error) {
this.endTimeElement.textContent = 'Error fetching end time';
console.error('Error fetching end time:', error);
}
}
startClock() {
const updateClock = () => {
if (this.serverTime) {
this.serverTime.setSeconds(this.serverTime.getSeconds() + 1);
this.serverTimeElement.textContent = `Server Time (TCT): ${this.formatDateToHHMMSS(this.serverTime)}`;
}
if (this.endTime && this.serverTime) {
const timeLeft = this.endTime - this.serverTime;
this.timeLeftElement.textContent = `Time Left: ${timeLeft > 0 ? this.formatMillisecondsToHHMMSS(timeLeft) : '00:00:00'}`;
}
};
// Immediately update the clock
updateClock();
// Continue updating every second
setInterval(updateClock, 1000);
}
formatDateToYYYYMMDDHHMMSS(date) {
if (!(date instanceof Date) || isNaN(date)) {
console.error('Invalid date:', date);
return '';
}
return `${date.getFullYear()}-${String(date.getMonth() + 1).padStart(2, '0')}-${String(date.getDate()).padStart(2, '0')} ` +
`${String(date.getHours()).padStart(2, '0')}:${String(date.getMinutes()).padStart(2, '0')}:${String(date.getSeconds()).padStart(2, '0')}`;
}
formatDateToHHMMSS(date) {
if (!(date instanceof Date) || isNaN(date)) {
console.error('Invalid date:', date);
return '';
}
return `${String(date.getHours()).padStart(2, '0')}:${String(date.getMinutes()).padStart(2, '0')}:${String(date.getSeconds()).padStart(2, '0')}`;
}
formatMillisecondsToHHMMSS(ms) {
const totalSeconds = Math.floor(ms / 1000);
const hours = Math.floor(totalSeconds / 3600);
const minutes = Math.floor((totalSeconds % 3600) / 60);
const seconds = totalSeconds % 60;
return `${String(hours).padStart(2, '0')}:${String(minutes).padStart(2, '0')}:${String(seconds).padStart(2, '0')}`;
}
}
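The countdown arithmetic in `startClock()` and `formatMillisecondsToHHMMSS()` can be checked outside the browser. The following is a minimal Python sketch of the same calculation, not part of the commit; the function names `format_ms_to_hhmmss` and `time_left_label` are hypothetical.

```python
def format_ms_to_hhmmss(ms: int) -> str:
    """Format a millisecond duration as zero-padded HH:MM:SS (whole seconds only)."""
    total_seconds = ms // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def time_left_label(end_ms: int, now_ms: int) -> str:
    """Build the 'Time Left' text; clamp to 00:00:00 once the end time has passed."""
    remaining = end_ms - now_ms
    shown = format_ms_to_hhmmss(remaining) if remaining > 0 else "00:00:00"
    return f"Time Left: {shown}"


if __name__ == "__main__":
    print(time_left_label(end_ms=3_661_000, now_ms=0))
```

As in the JavaScript version, sub-second remainders are dropped and a past end time is shown as `00:00:00` rather than as a negative duration.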


@@ -1,16 +1,100 @@
{% extends 'base.html' %} {% extends 'base.html' %}
{% block content %} {% block content %}
<section class="container-fluid d-flex justify-content-center">
<div class="container-md my-5 mx-2 shadow-lg p-4 "> <section class="container-fluid d-flex justify-content-center">
<div class="container-sm"> <div class="container-md my-5 mb-3 mx-2 shadow-lg p-4">
<div class="row"> <div class="container-sm">
<div class="col"> <div class="row">
<h2>Analyze</h2> <div class="col">
</div> <h2>User Activity Distribution</h2>
<div class="col">
</div>
</div>
</div> </div>
</div> </div>
</section> <div class="row">
{% endblock content %} <div class="col">
<form method="POST" action="{{ url_for('views.analyze') }}">
<!-- Dropdown for selecting data file -->
<label for="data_file" class="form-label">Choose Data File:</label>
<select name="data_file" id="data_file" class="form-select">
{% if data_files %}
{% for file in data_files %}
<option value="{{ file }}" {% if file == selected_file %}selected{% endif %}>{{ file.split('/')[-1] }}</option>
{% endfor %}
{% else %}
<option disabled>No CSV files found</option>
{% endif %}
</select>
<!-- Analysis Selection Table -->
<label for="analyses" class="form-label">Select Analyses:</label>
<table id="analysesTable" class="table table-bordered table-striped">
<thead>
<tr>
<th width="2%"><input type="checkbox" id="checkAllAnalyses" class="form-check-input" onclick="checkAllCheckboxes('analysesTable', 'checkAllAnalyses')"></th>
<th>Analysis Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
{% if analyses %}
{% for analysis in analyses %}
<tr>
<td>
<input class="form-check-input" type="checkbox" name="analyses" value="{{ analysis.name }}"
{% if analysis.name in selected_analyses %}checked{% endif %}>
</td>
<td>{{ analysis.name }}</td>
<td>{{ analysis.description }}</td>
</tr>
{% endfor %}
{% else %}
<tr>
<td colspan="3" class="text-center">No analyses available</td>
</tr>
{% endif %}
</tbody>
</table>
<button type="submit" class="btn btn-primary mt-3">Run Analyses</button>
</form>
</div>
</div>
{% include 'includes/error.html' %}
</div>
</div>
</section>
{% if plot_url %}
<section class="container-fluid d-flex justify-content-center">
<div class="container-md my-1 mx-2 shadow-lg p-4">
<div class="container-sm">
<div class="row mt-4">
<div class="col">
<h4>Selected File: {{ selected_file.split('/')[-1] }}</h4>
<img src="{{ plot_url }}" class="img-fluid rounded shadow" alt="User Activity Distribution">
</div>
</div>
</div>
</div>
</section>
{% endif %}
{% if results %}
{% for analysis_name, result in results.items() %}
<section class="container-fluid d-flex justify-content-center">
<div class="container-md my-2 mx-2 shadow p-4 pt-0">
<div class="container-sm">
<div class="results mt-4">
<h3>{{ analysis_name }}</h3>
<div class="analysis-output">
{{ result | safe }} <!-- This allows HTML output -->
</div>
</div>
</div>
</div>
</section>
{% endfor %}
{% endif %}
{% endblock %}
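The template above relies on each analysis exposing `name` and `description`, and on each result being a string that is rendered with `| safe`. A minimal sketch of that contract follows, assuming a plain list of row dicts in place of the project's DataFrame; the `Analysis` dataclass, `run_selected`, and the two sample analyses are hypothetical and only illustrate the shape the template expects.

```python
from dataclasses import dataclass
from html import escape
from typing import Callable


@dataclass
class Analysis:
    """Hypothetical stand-in for an analysis plugin: a name, a description, a callable."""
    name: str
    description: str
    run: Callable[[list], str]

    def execute(self, rows: list) -> str:
        return self.run(rows)


def count_rows(rows: list) -> str:
    return f"<p>{len(rows)} rows</p>"


def list_users(rows: list) -> str:
    # Output is rendered unescaped by the template, so escape untrusted values here.
    items = "".join(f"<li>{escape(row['user'])}</li>" for row in rows)
    return f"<ul>{items}</ul>"


def run_selected(analyses: list, selected_names: list, rows: list) -> dict:
    """Run only the analyses whose names were ticked in the form."""
    return {a.name: a.execute(rows) for a in analyses if a.name in selected_names}


if __name__ == "__main__":
    available = [
        Analysis("Row count", "Counts rows", count_rows),
        Analysis("Users", "Lists user names", list_users),
    ]
    print(run_selected(available, ["Users"], [{"user": "<b>x</b>"}]))
```

Because `| safe` disables Jinja's autoescaping for the result, any value that originates from scraped data should be escaped inside the plugin, as `list_users` does.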


@@ -22,6 +22,9 @@
{% block content %} {% block content %}
{% endblock %} {% endblock %}
</main> </main>
<footer>
{% include 'includes/footer.html' %}
</footer>
{% block scripts %} {% block scripts %}
{% include 'includes/scripts.html' %} {% include 'includes/scripts.html' %}
{% endblock %} {% endblock %}


@@ -1,68 +0,0 @@
{% extends 'base.html' %}
{% block content %}
<section class="container-fluid d-flex justify-content-center">
<div class="container-md my-5 mx-2 shadow-lg p-4 ">
<div class="container-sm">
<div class="row">
<div class="col">
<h2>User Activity Distribution</h2>
</div>
<div class="col text-end">
<!-- Dropdown for selecting data file -->
<form method="POST" action="{{ url_for('views.data_visualization') }}">
<label for="data_file" class="form-label">Choose Data File:</label>
<select name="data_file" id="data_file" class="form-select" onchange="this.form.submit()">
{% for file in data_files %}
<option value="{{ file }}" {% if file == selected_file %}selected{% endif %}>
{{ file.split('/')[-1] }}
</option>
{% endfor %}
</select>
</form>
</div>
</div>
{% if error %}
<div class="alert alert-danger mt-3" role="alert">
{{ error }}
</div>
{% endif %}
{% if plot_url %}
<div class="row mt-4">
<div class="col">
<h4>Selected File: {{ selected_file.split('/')[-1] }}</h4>
<img src="{{ plot_url }}" class="img-fluid rounded shadow" alt="User Activity Distribution">
</div>
</div>
{% endif %}
{% if statistics %}
<div class="row mt-4">
<div class="col">
<h2>Activity Statistics</h2>
<table class="table table-bordered table-hover">
<thead class="table-dark">
<tr>
<th>Hour</th>
<th>Activity Count</th>
</tr>
</thead>
<tbody>
{% for hour, count in statistics.items() %}
<tr>
<td>{{ hour }}</td>
<td>{{ count }}</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
{% endif %}
</div>
</div>
</section>
{% endblock content %}


@@ -18,7 +18,7 @@
<table id="dataFilesTable" class="table table-striped table-bordered table-hover"> <table id="dataFilesTable" class="table table-striped table-bordered table-hover">
<thead> <thead>
<tr> <tr>
<th width="2%"><input type="checkbox" id="checkAllData" onclick="checkAllCheckboxes('dataFilesTable', 'checkAllData')"></th> <th width="2%"><input type="checkbox" class="form-check-input" id="checkAllData" onclick="checkAllCheckboxes('dataFilesTable', 'checkAllData')"></th>
<th onclick="sortTable(1, 'dataFilesTable')">File Name</th> <th onclick="sortTable(1, 'dataFilesTable')">File Name</th>
<th onclick="sortTable(2, 'dataFilesTable')">Last Modified</th> <th onclick="sortTable(2, 'dataFilesTable')">Last Modified</th>
<th onclick="sortTable(3, 'dataFilesTable')">Created</th> <th onclick="sortTable(3, 'dataFilesTable')">Created</th>
@@ -30,7 +30,7 @@
<tbody> <tbody>
{% for file in files.data %} {% for file in files.data %}
<tr> <tr>
<td><input type="checkbox" name="fileCheckbox" value="{{ url_for('download_data_file', filename=file.name_display) }}"{{ ' disabled' if file.active }}></td> <td><input type="checkbox" name="fileCheckbox" class="form-check-input" value="{{ url_for('download_data_file', filename=file.name_display) }}"{{ ' disabled' if file.active }}></td>
<td><a href="{{ url_for('download_data_file', filename=file.name_display) }}" target="_blank">{{ file.name_display }}</a></td> <td><a href="{{ url_for('download_data_file', filename=file.name_display) }}" target="_blank">{{ file.name_display }}</a></td>
<td>{{ file.last_modified | datetimeformat }}</td> <td>{{ file.last_modified | datetimeformat }}</td>
<td>{{ file.created | datetimeformat }}</td> <td>{{ file.created | datetimeformat }}</td>
@@ -67,7 +67,7 @@
<table id="logFilesTable" class="table table-striped table-bordered table-hover"> <table id="logFilesTable" class="table table-striped table-bordered table-hover">
<thead> <thead>
<tr> <tr>
<th width="2%"><input type="checkbox" id="checkAllLog" onclick="checkAllCheckboxes('logFilesTable', 'checkAllLog')"></th> <th width="2%"><input type="checkbox" id="checkAllLog" class="form-check-input" onclick="checkAllCheckboxes('logFilesTable', 'checkAllLog')"></th>
<th onclick="sortTable(1, 'logFilesTable')">File Name</th> <th onclick="sortTable(1, 'logFilesTable')">File Name</th>
<th onclick="sortTable(2, 'logFilesTable')">Last Modified</th> <th onclick="sortTable(2, 'logFilesTable')">Last Modified</th>
<th onclick="sortTable(3, 'logFilesTable')">Created</th> <th onclick="sortTable(3, 'logFilesTable')">Created</th>
@@ -79,7 +79,7 @@
<tbody> <tbody>
{% for file in files.log %} {% for file in files.log %}
<tr> <tr>
<td><input type="checkbox" name="fileCheckbox" value="{{ url_for('download_log_file', filename=file.name_display) }}"{{ ' disabled' if file.active }}></td> <td><input type="checkbox" name="fileCheckbox" class="form-check-input" value="{{ url_for('download_log_file', filename=file.name_display) }}"{{ ' disabled' if file.active }}></td>
<td><a href="{{ url_for('download_log_file', filename=file.name_display) }}" target="_blank">{{ file.name_display }}</a></td> <td><a href="{{ url_for('download_log_file', filename=file.name_display) }}" target="_blank">{{ file.name_display }}</a></td>
<td>{{ file.last_modified | datetimeformat }}</td> <td>{{ file.last_modified | datetimeformat }}</td>
<td>{{ file.created | datetimeformat }}</td> <td>{{ file.created | datetimeformat }}</td>
@@ -98,8 +98,5 @@
</table> </table>
</div> </div>
</section> </section>
{% block scripts %}
{{ bootstrap.load_js() }}
<script src="{{url_for('.static', filename='download_results.js')}}"></script> <script src="{{url_for('.static', filename='download_results.js')}}"></script>
{% endblock %} {% endblock %}
{% endblock content %}


@@ -0,0 +1,6 @@
{% if error %}
<div class="alert alert-danger alert-dismissible fade show mt-3" role="alert">
<strong>Error:</strong> {{ error }}
<button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Close"></button>
</div>
{% endif %}


@@ -1,9 +1,8 @@
<!-- app/templates/includes/navigation.html -->
<nav class="navbar navbar-nav navbar-expand-md bg-primary"> <nav class="navbar navbar-nav navbar-expand-md bg-primary">
<div class="container-fluid"> <div class="container-fluid">
<a class="navbar-brand" href="/">Torn User Activity Scraper</a> <a class="navbar-brand" href="/">{{ main_config.APP_TITLE }}</a>
{% from 'bootstrap4/nav.html' import render_nav_item %} {% from 'bootstrap4/nav.html' import render_nav_item %}
{{ render_nav_item('views.data_visualization', 'Data Visualization') }} {{ render_nav_item('views.analyze', 'Data Visualization') }}
{{ render_nav_item('download_results', 'Files') }} {{ render_nav_item('download_results', 'Files') }}
{{ render_nav_item('log_viewer', 'Logs') }} {{ render_nav_item('log_viewer', 'Logs') }}
<div class="d-flex" id="color-mode-toggle"> <div class="d-flex" id="color-mode-toggle">
@@ -15,3 +14,26 @@
</div> </div>
</div> </div>
</nav> </nav>
<div id="status_container" class="container-fluid d-flex justify-content-center">
<div class="container-md my-1 shadow p-4 pb-0 m-1 w-50" id="status_badges">
<div id="loading_indicator" class="alert alert-info">Loading...</div>
<div id="status_content">
<div class="row justify-content-center">
<div class="col col-6 p-1">
<div id="activity_indicator" class="alert alert-danger fw-bolder">Inactive</div>
</div>
<div class="col col-6 p-1">
<div id="server_time" class="alert alert-primary">Server Time (TCT):</div>
</div>
</div>
<div class="row justify-content-center">
<div class="col col-6 p-1">
<div id="end_time" class="alert alert-info">Running until:</div>
</div>
<div class="col p-1">
<div id="time-left" class="alert alert-info">Time Left:</div>
</div>
</div>
</div>
</div>
</div>


@@ -1,2 +1,3 @@
{{ bootstrap.load_js() }} {{ bootstrap.load_js() }}
<script src="{{url_for('static', filename='color_mode.js')}}"></script> <script src="{{url_for('static', filename='color_mode.js')}}"></script>
<script type="module" src="{{ url_for('static', filename='common.js') }}"></script>


@@ -2,7 +2,13 @@
{% block content %} {% block content %}
<section id="scrapingFormContainer" class="container-fluid d-flex justify-content-center"> <section id="scrapingFormContainer" class="container-fluid d-flex justify-content-center">
<div class="container-md my-5 mx-2 shadow-lg p-4 "> <div class="container-md my-5 mx-2 shadow-lg p-4 ">
<h2>Scraper <span id="activity_indicator" class="badge text-bg-danger">Inactive</span></h2> <div class="row">
<div class="col">
<h2>Scraper</h2>
</div>
<div class="col text-end">
</div>
</div>
<form id="scrapingForm" method="POST" action="{{ url_for('start_scraping') }}"> <form id="scrapingForm" method="POST" action="{{ url_for('start_scraping') }}">
{{ form.hidden_tag() }} {{ form.hidden_tag() }}
<div class="form-group"> <div class="form-group">
@@ -24,23 +30,5 @@
</div> </div>
</div> </div>
</section> </section>
<section id="resultsContainer" class="container-fluid d-flex justify-content-center"> <script type="module" src="{{url_for('static', filename='index.js')}}"></script>
<div class="container-md my-5 mx-2 shadow-lg p-4" style="height: 500px;">
<div class="row">
<div class="col-8">
<h2>Logs</h2>
<pre id="logs" class="pre-scrollable" style="height: 350px; overflow:scroll; "><code></code></pre>
<div class="btn-group btn-group-sm">
<button class="btn btn-primary" id="prevPage">Previous</button>
<button class="btn btn-primary" id="pageInfo" disabled>Page 1 of 1</button>
<button class="btn btn-primary" id="nextPage">Next</button>
</div>
</div>
<div class="col">
<h2>Stats</h2>
</div>
</div>
</div>
</section>
<script src="{{url_for('static', filename='index.js')}}"></script>
{% endblock content %} {% endblock content %}


@@ -1,3 +1,22 @@
{% extends 'base.html' %} {% extends 'base.html' %}
{% block content %} {% block content %}
<section id="resultsContainer" class="container-fluid d-flex justify-content-center">
<div class="container-md my-5 mx-2 shadow-lg p-4" style="height: 500px;">
<div class="row">
<div class="col-8">
<h2>Logs</h2>
<pre id="logs" class="pre-scrollable" style="height: 350px; overflow:scroll;"><code></code></pre>
<div class="btn-group btn-group-sm">
<button class="btn btn-primary" id="prevPage">Previous</button>
<button class="btn btn-primary" id="pageInfo" disabled>Page 1 of 1</button>
<button class="btn btn-primary" id="nextPage">Next</button>
</div>
</div>
<div class="col">
<h2>Stats</h2>
</div>
</div>
</div>
</section>
<script src="{{url_for('static', filename='log_viewer.js')}}"></script>
{% endblock content %} {% endblock content %}


@@ -1,13 +1,10 @@
import os import os
import zipfile import zipfile
from datetime import datetime, timedelta from datetime import datetime, timedelta
from flask import current_app
from app.state import data_file_name, log_file_name
from app.config import load_config from app.config import load_config
config = load_config()
def create_zip(file_paths, zip_name, app): def create_zip(file_paths, zip_name, app):
temp_dir = os.path.abspath(app.config['TEMP']['TEMP_DIR']) temp_dir = os.path.abspath(app.config['TEMP']['TEMP_DIR'])
zip_path = os.path.join(temp_dir, zip_name) zip_path = os.path.join(temp_dir, zip_name)
@@ -18,7 +15,7 @@ def create_zip(file_paths, zip_name, app):
return zip_path return zip_path
def delete_old_zips(): def delete_old_zips():
temp_dir = os.path.abspath(config['TEMP']['TEMP_DIR']) temp_dir = os.path.abspath(current_app.config['TEMP']['TEMP_DIR'])
now = datetime.now() now = datetime.now()
for filename in os.listdir(temp_dir): for filename in os.listdir(temp_dir):
if filename.endswith('.zip'): if filename.endswith('.zip'):
@@ -33,7 +30,7 @@ def tail(filename, n):
yield '' yield ''
return return
page_size = int(config['LOGGING']['TAIL_PAGE_SIZE']) page_size = int(current_app.config['LOGGING']['TAIL_PAGE_SIZE'])
offsets = [] offsets = []
count = _n = n if n >= 0 else -n count = _n = n if n >= 0 else -n


@@ -5,17 +5,11 @@ from flask import render_template, Blueprint, current_app, request
from app.forms import ScrapingForm from app.forms import ScrapingForm
from app.util import get_size from app.util import get_size
from app.config import load_config from app.config import load_config
from app.api import scraper as scraper# Import the scraper instance from app.api import scraper as scraper
from app.logging_config import get_logger
from app.analysis import load_data, generate_statistics, plot_activity_distribution
from app.analysis import load_data, load_analysis_modules
from app.state import log_file_name from datetime import datetime
print(f"A imported log_file_name: {log_file_name}")
config = load_config()
logger = get_logger()
views_bp = Blueprint("views", __name__) views_bp = Blueprint("views", __name__)
@@ -29,10 +23,6 @@ def register_views(app):
def results(): def results():
return render_template('results.html') return render_template('results.html')
@app.route('/analyze')
def analyze():
return render_template('analyze.html')
@app.route('/log_viewer') @app.route('/log_viewer')
def log_viewer(): def log_viewer():
return render_template('log_viewer.html') return render_template('log_viewer.html')
@@ -47,8 +37,8 @@ def register_views(app):
if not scraper: if not scraper:
print("Scraper not initialized") print("Scraper not initialized")
data_dir = os.path.abspath(config['DATA']['DATA_DIR']) data_dir = os.path.abspath(current_app.config['DATA']['DATA_DIR'])
log_dir = os.path.abspath(config['LOGGING']['LOG_DIR']) log_dir = os.path.abspath(current_app.config['LOGGING']['LOG_DIR'])
data_files = glob.glob(os.path.join(data_dir, "*.csv")) data_files = glob.glob(os.path.join(data_dir, "*.csv"))
log_files = glob.glob(os.path.join(log_dir, "*.log")) log_files = glob.glob(os.path.join(log_dir, "*.log"))
@@ -87,42 +77,50 @@ def register_views(app):
views_bp = Blueprint("views", __name__) views_bp = Blueprint("views", __name__)
@views_bp.route("/data-visualization", methods=["GET", "POST"]) @views_bp.route("/analyze", methods=["GET", "POST"])
def data_visualization(): def analyze():
"""Route to display activity statistics with a visualization.""" analysis_modules = load_analysis_modules() # Load available analyses
data_dir = current_app.config["DATA"]["DATA_DIR"] data_dir = current_app.config.get("DATA", {}).get("DATA_DIR")
selected_file = None
selected_analyses = []
# Find all available CSV files # Find all available CSV files
data_files = sorted( data_files = sorted(
glob.glob(os.path.join(data_dir, "*.csv")), glob.glob(os.path.join(data_dir, "*.csv")),
key=os.path.getmtime, key=os.path.getmtime,
reverse=True reverse=True
) ) if data_dir else []
if not data_files: context = {
return render_template("data_visualization.html", error="No data files found.", data_files=[]) "data_files": data_files,
"analyses": analysis_modules,
"selected_file": selected_file,
"selected_analyses": selected_analyses
}
# Get the selected file from the dropdown (default to the latest file) if request.method == "POST":
selected_file = request.form.get("data_file", data_files[0] if data_files else None) selected_analyses = request.form.getlist("analyses")
selected_file = request.form.get("data_file")
if not selected_file:
context["error"] = "No file selected."
return render_template("analyze.html", **context)
if selected_file and os.path.exists(selected_file):
df = load_data(selected_file) df = load_data(selected_file)
statistics = generate_statistics(df) results = {}
# ✅ Generate the plot and get the correct URL path for analysis in analysis_modules:
# remove app/ from the base URL if analysis.name in selected_analyses:
plot_url = plot_activity_distribution(df).replace("app/", "") results[analysis.name] = analysis.execute(df) # Some may return HTML
else: context["results"] = results
return render_template("data_visualization.html", error="Invalid file selection.", data_files=data_files)
return render_template( return render_template("analyze.html", **context)
"data_visualization.html",
plot_url=plot_url,
statistics=statistics.to_dict(),
data_files=data_files,
selected_file=selected_file
)
@views_bp.route('/server_time')
def server_time():
current_time = datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S')
return {'server_time': current_time}
app.register_blueprint(views_bp) app.register_blueprint(views_bp)
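The new `/server_time` route calls `datetime.utcnow()`, which is deprecated as of Python 3.12. A timezone-aware variant that produces the same `'%Y-%m-%d %H:%M:%S'` string could look like the sketch below; `format_server_time` is a hypothetical helper, not code from the commit.

```python
from datetime import datetime, timezone


def format_server_time(now=None):
    """Return the payload served by /server_time, using an aware UTC timestamp."""
    # `now` must be timezone-aware when supplied; it defaults to the current UTC time.
    current = now if now is not None else datetime.now(timezone.utc)
    text = current.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return {"server_time": text}


if __name__ == "__main__":
    payload = format_server_time()
    # The client swaps the space for 'T' before parsing; the result is valid ISO 8601.
    print(payload, datetime.fromisoformat(payload["server_time"].replace(" ", "T")))
```

The string carries no UTC offset, so the browser's `new Date(...)` interprets it as local time; that matches the clock code above, which reads the value back with local getters.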


@@ -1,3 +1,7 @@
# All main config options will be passed to template engine
[MAIN]
APP_TITLE = 'Torn User Activity Grabber'
[DEFAULT] [DEFAULT]
SECRET_KEY = your_secret_key SECRET_KEY = your_secret_key
API_KEY = your_api_key API_KEY = your_api_key
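The comment above says every `[MAIN]` option is handed to the template engine. How the project loads the file is not shown in this diff, so the following is only a sketch using the standard-library `configparser`; `load_main_config` and the inline `SAMPLE` text are assumptions. It highlights two behaviours worth knowing: `configparser` keeps the quotes around `APP_TITLE`, and it merges `[DEFAULT]` keys into every section.

```python
import configparser

SAMPLE = """
[MAIN]
APP_TITLE = 'Torn User Activity Grabber'

[DEFAULT]
SECRET_KEY = your_secret_key
API_KEY = your_api_key
"""


def load_main_config(text: str) -> dict:
    """Return only the [MAIN] options, unquoted, for use in templates."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names in their original case
    parser.read_string(text)
    defaults = parser.defaults()
    # [DEFAULT] keys show up in every section; skip them so secrets stay out of templates.
    # (A MAIN key that reuses a DEFAULT name is skipped as well in this simple version.)
    return {
        key: value.strip("'\"")
        for key, value in parser["MAIN"].items()
        if key not in defaults
    }


if __name__ == "__main__":
    print(load_main_config(SAMPLE))
```

A dictionary like this could then be exposed to templates under a name such as `main_config`, which is what the navigation template reads via `main_config.APP_TITLE`.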


@@ -7,3 +7,5 @@ requests
matplotlib matplotlib
seaborn seaborn
configparser configparser
plotly
configobj


@@ -14,6 +14,8 @@ charset-normalizer==3.4.1
# via requests # via requests
click==8.1.8 click==8.1.8
# via flask # via flask
configobj==5.0.9
# via -r requirements.in
configparser==7.1.0 configparser==7.1.0
# via -r requirements.in # via -r requirements.in
contourpy==1.3.1 contourpy==1.3.1
@@ -48,6 +50,8 @@ matplotlib==3.10.0
# via # via
# -r requirements.in # -r requirements.in
# seaborn # seaborn
narwhals==1.26.0
# via plotly
numpy==2.2.2 numpy==2.2.2
# via # via
# contourpy # contourpy
@@ -55,13 +59,17 @@ numpy==2.2.2
# pandas # pandas
# seaborn # seaborn
packaging==24.2 packaging==24.2
# via matplotlib # via
# matplotlib
# plotly
pandas==2.2.3 pandas==2.2.3
# via # via
# -r requirements.in # -r requirements.in
# seaborn # seaborn
pillow==11.1.0 pillow==11.1.0
# via matplotlib # via matplotlib
plotly==6.0.0
# via -r requirements.in
pyparsing==3.2.1 pyparsing==3.2.1
# via matplotlib # via matplotlib
python-dateutil==2.9.0.post0 python-dateutil==2.9.0.post0

4
run.py

@@ -1,5 +1,5 @@
from app.app import init_app from app import create_app
if __name__ == '__main__': if __name__ == '__main__':
app = init_app() app = create_app()
app.run(debug=True, threaded=True) app.run(debug=True, threaded=True)