**Run all the following as the analytics user.**

0) Deploy the new version of refinery as per https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Deploy/Refinery-source

1) Open spark2-sql and run:

```
ALTER TABLE `wmf`.`editors_daily`
CHANGE COLUMN user_fingerprint_or_id user_fingerprint_or_name STRING
COMMENT 'If an anonymous user, this is a hash of the IP + UA, otherwise it is their global username across wiki dbs';
```

Now run:

```
show partitions wmf.editors_daily;
exit;
```

Example output:

```
spark-sql (default)> show partitions wmf.editors_daily;
partition
month=2022-06
month=2022-07
```

2) For each month partition shown above, run the following HQL, which will INSERT OVERWRITE the partition with valid data (see the loop sketch at the end of this page for running all months in one go). Each run should take 2-3 minutes:

```
spark2-sql --master yarn --executor-memory 8G --executor-cores 4 --driver-memory 2G --conf spark.dynamicAllocation.maxExecutors=64 \
  -f editors_daily_monthly.hql \
  -d refinery_hive_jar=hdfs://analytics-hadoop/wmf/cache/artifacts/airflow/refinery-hive-0.1.27-shaded.jar \
  -d source_table=wmf_raw.mediawiki_private_cu_changes \
  -d user_history_table=wmf.mediawiki_user_history \
  -d destination_table=wmf.editors_daily \
  -d month=2022-06 \
  -d coalesce_partitions=1
```

3) Now let's create the new monthly table:

```
spark2-sql -f create_editors_by_country_monthly_table.hql \
  --database wmf
```

Finally, run the following for the same set of partitions found in step 1 to populate the new wmf.unique_editors_by_country_monthly table (again, a loop sketch follows below). Each run should take 2-3 minutes:

```
spark2-sql --master yarn --executor-memory 8G --executor-cores 4 --driver-memory 2G --conf spark.dynamicAllocation.maxExecutors=64 \
  -f unique_editors_by_country_monthly.hql \
  -d source_table=wmf.editors_daily \
  -d destination_table=wmf.unique_editors_by_country_monthly \
  -d month=2022-06 \
  -d coalesce_partitions=1
```
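Since step 2 has to be run once per month partition, a minimal shell sketch like the one below can loop the step 2 command over all of them. This loop is not part of the original runbook; the month list is just the example output from step 1 (2022-06 and 2022-07), so substitute the partitions you actually found.

```
# Hypothetical convenience loop: backfill wmf.editors_daily for every
# month partition found in step 1. Replace the month list with your
# actual partitions.
for month in 2022-06 2022-07; do
  spark2-sql --master yarn --executor-memory 8G --executor-cores 4 \
    --driver-memory 2G --conf spark.dynamicAllocation.maxExecutors=64 \
    -f editors_daily_monthly.hql \
    -d refinery_hive_jar=hdfs://analytics-hadoop/wmf/cache/artifacts/airflow/refinery-hive-0.1.27-shaded.jar \
    -d source_table=wmf_raw.mediawiki_private_cu_changes \
    -d user_history_table=wmf.mediawiki_user_history \
    -d destination_table=wmf.editors_daily \
    -d month="${month}" \
    -d coalesce_partitions=1
done
```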
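The same pattern applies to the final populate step; again, the month list is an assumption based on the step 1 example output and should be replaced with the real partitions.

```
# Hypothetical convenience loop: populate
# wmf.unique_editors_by_country_monthly for the same partitions.
for month in 2022-06 2022-07; do
  spark2-sql --master yarn --executor-memory 8G --executor-cores 4 \
    --driver-memory 2G --conf spark.dynamicAllocation.maxExecutors=64 \
    -f unique_editors_by_country_monthly.hql \
    -d source_table=wmf.editors_daily \
    -d destination_table=wmf.unique_editors_by_country_monthly \
    -d month="${month}" \
    -d coalesce_partitions=1
done
```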