module SiteStandards
Defines partial standards for the Apache website checker.
TODO: better document individual scans with specific policies.
Constants
- CHECK_CAPTURE
- CHECK_DOC
- CHECK_POLICY
- CHECK_TEXT
- CHECK_TYPE
- CHECK_VALIDATE
- COMMON_CHECKS
  Checks done for all podlings and projects
- CSP_INFRA_BASE
- CSP_PROJECT_DOMAINS
- CSP_THIRD_PARTY
- DEFAULT_CSP
- DEFAULT_CSP_RE
- PODLING_CHECKS
  Checks done only for Incubator podlings
- SITE_FAIL
- SITE_PASS
- SITE_WARN
- TLP_CHECKS
  Checks done only for TLPs (i.e. not podlings)
- WWW_CSP
  CSP for main website
Public Instance Methods
Source
# File lib/whimsy/sitestandards.rb, line 256
def _validate(site, match, key)
  # return method(match).call(site, key) if match.is_a? Symbol
  return site =~ match
end
Source
# File lib/whimsy/sitestandards.rb, line 309
def analyze(sites, checks)
  process_csp sites
  success = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
  counts = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
  checks.each do |nam, check_data|
    success[nam] = sites.select { |key, site| _validate(site[nam], check_data[SiteStandards::CHECK_VALIDATE], key) }.keys
    counts[nam][SITE_PASS] = success[nam].count
    counts[nam][SITE_WARN] = 0 # Reorder output
    counts[nam][SITE_FAIL] = sites.select { |_, site| site[nam].nil? }.count
    counts[nam][SITE_WARN] = sites.size - counts[nam][SITE_PASS] - counts[nam][SITE_FAIL]
  end
  return [
    counts, {
      SITE_PASS => '# Sites with links to primary ASF page',
      SITE_WARN => '# Sites with link, but not an expected ASF one',
      SITE_FAIL => '# Sites with no link for this topic'
    }, success
  ]
end
Analyze data returned from site-scan.rb by applying each check's regex.
- If value =~ CHECK_VALIDATE, then SITE_PASS
- If value is present (presumably from CHECK_TEXT|CAPTURE) but does not match, then SITE_WARN
- If value is not present, then SITE_FAIL (i.e. site-scan.rb didn't find it)
@param sites hash of site-scan data collected
@param checks to apply to sites to determine status
@return [overall counts, description of statuses, success listings]
Called by site_or_pod.rb
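The three-way classification above can be sketched on its own. This is a minimal illustration, not the module's code: the status values, the `classify` helper, the check name `license`, the regex, and the site data are all made up for the example.

```ruby
# Hypothetical stand-ins for SITE_PASS / SITE_WARN / SITE_FAIL; the real
# values are not shown in this documentation.
PASS = 'pass'
WARN = 'warn'
FAIL = 'fail'

# Classify one scanned value against a validation regex (assumed helper).
def classify(value, validate_re)
  return FAIL if value.nil?            # the scan found nothing
  return PASS if value =~ validate_re  # found, and it is the expected link
  WARN                                 # found, but not the expected link
end

# Made-up scan data for three sites and a single check named 'license'.
sites = {
  'alpha' => { 'license' => 'https://www.apache.org/licenses/' },
  'beta'  => { 'license' => 'https://example.org/license.html' },
  'gamma' => {}
}
license_re = %r{^https?://.*apache\.org/licenses/?$}

statuses = sites.transform_values { |site| classify(site['license'], license_re) }
tally = statuses.values.tally # per-status counts, like the first return value
```

As in analyze, the warning bucket is simply whatever is left once passes and failures are accounted for.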
Source
# File lib/whimsy/sitestandards.rb, line 181
def get_checks(tlp = true)
  tlp ? (return TLP_CHECKS.merge(COMMON_CHECKS)) : (return PODLING_CHECKS.merge(COMMON_CHECKS))
end
Get the hash of checks to be done for a TLP or podling
@param tlp true if project; podling otherwise
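A small sketch of the merge that get_checks performs, assuming nothing about the real check tables: the three hashes and the `pick` lambda below are invented placeholders, not the contents of TLP_CHECKS, PODLING_CHECKS or COMMON_CHECKS.

```ruby
# Made-up check tables standing in for the module's constants.
tlp_only     = { 'events' => 'TLP-only rule' }
podling_only = { 'disclaimer' => 'podling-only rule' }
common       = { 'license' => 'shared rule' }

# Hash#merge returns a new hash; on a key clash the argument's value wins,
# so a common check would override a type-specific one of the same name.
pick = ->(tlp) { (tlp ? tlp_only : podling_only).merge(common) }

tlp_checks     = pick.call(true)
podling_checks = pick.call(false)
```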
Source
# File lib/whimsy/sitestandards.rb, line 187
def get_filename(tlp = true)
  tlp ? (return 'site-scan.json') : (return 'pods-scan.json')
end
Get the filename of the check data for a TLP or podling
@param tlp true if project; podling otherwise
Source
# File lib/whimsy/sitestandards.rb, line 200
def get_sites(tlp = true)
  local_copy = File.expand_path("#{get_url(true)}#{get_filename(tlp)}", __FILE__)
  begin
    if File.exist? local_copy
      crawl_time = File.mtime(local_copy).httpdate # show time in same format as last-mod
      sites = JSON.parse(File.read(local_copy, :encoding => 'utf-8'))
    else
      require 'wunderbar'
      Wunderbar.warn "Failed to find local copy #{local_copy}"
      local_copy = "#{get_url(false)}#{get_filename(tlp)}"
      response = Net::HTTP.get_response(URI(local_copy))
      crawl_time = response['last-modified']
      sites = JSON.parse(response.body)
    end
  rescue StandardError => e
    require 'wunderbar'
    Wunderbar.warn "Failed to parse #{local_copy}: #{e.inspect} #{e.backtrace.join("\n\t")}"
    sites = {}
  end
  return sites, crawl_time
end
Get the check data for a TLP or podling
Uses a local copy if available; otherwise fetches the file from whimsy.apache.org/public
@param tlp true if project; podling otherwise
@return [hash of site data, crawl_time]
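The local-then-remote fallback can be sketched without touching the network. In this illustration `load_scan` and its `fetcher` argument are hypothetical: the fetcher stands in for the HTTP request and must return a body plus a last-modified string.

```ruby
require 'json'
require 'time'
require 'tmpdir'

# Hypothetical helper: read JSON from local_path if it exists, otherwise
# call the fetcher. Any error yields an empty hash, mirroring get_sites.
def load_scan(local_path, fetcher)
  if File.exist?(local_path)
    crawl_time = File.mtime(local_path).httpdate # same format as Last-Modified
    sites = JSON.parse(File.read(local_path, encoding: 'utf-8'))
  else
    body, crawl_time = fetcher.call
    sites = JSON.parse(body)
  end
  [sites, crawl_time]
rescue StandardError => e
  warn "Failed to load #{local_path}: #{e.inspect}"
  [{}, crawl_time]
end

# Stub fetcher with made-up content in place of a real HTTP response.
stub = -> { ['{"remote": {}}', 'Mon, 01 Jan 2024 00:00:00 GMT'] }

from_remote = from_local = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 'site-scan.json')
  from_remote = load_scan(path, stub).first # file missing: fetcher is used
  File.write(path, '{"local": {}}')
  from_local = load_scan(path, stub).first  # file present: fetcher is skipped
end
```

Passing the fetcher in as an argument is a choice made for this sketch so it can run offline; get_sites itself calls Net::HTTP directly.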
Source
# File lib/whimsy/sitestandards.rb, line 192
def get_url(is_local = true)
  is_local ? (return '../../../www/public/') : (return 'https://whimsy.apache.org/public/')
end
Get the default location of the data files: a relative path on the server if is_local, the public URL otherwise
Source
# File lib/whimsy/sitestandards.rb, line 164
def label(analysis, links, col, name)
  if not links[col]
    # Non-PMCs don't need images
    if col == 'image' and (name == 'attic' or links['nonpmc'])
      SITE_PASS
    else
      SITE_FAIL
    end
  elsif analysis[2].include? col and not analysis[2][col].include? name
    SITE_WARN
  else
    SITE_PASS
  end
end
Determine the color of a given table cell, given:
- the overall analysis of the sites, in particular its third element, which lists the projects that successfully matched each check
- the list of links for the project in question
- the column in question (which indicates the check being reported on)
- the name of the project
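To make the inputs concrete, here is a standalone restatement of the same decision with invented data. The `CELL_*` values, the `cell_status` name, and the project names `alpha` and `beta` are assumptions for illustration; only `attic`, `image` and `nonpmc` come from the source above.

```ruby
# Hypothetical status values; the module's real constants are not shown.
CELL_PASS = 'pass'
CELL_WARN = 'warn'
CELL_FAIL = 'fail'

# Same decision as label, written as a free-standing method for the sketch.
def cell_status(analysis, links, col, name)
  unless links[col]
    # A missing image is tolerated for attic and for non-PMC sites.
    exempt = col == 'image' && (name == 'attic' || links['nonpmc'])
    return exempt ? CELL_PASS : CELL_FAIL
  end
  passed = analysis[2] # check name => names of projects that matched
  passed.include?(col) && !passed[col].include?(name) ? CELL_WARN : CELL_PASS
end

# Only the third element of the analysis is consulted here.
analysis = [nil, nil, { 'license' => ['alpha'] }]

matched   = cell_status(analysis, { 'license' => 'https://www.apache.org/licenses/' }, 'license', 'alpha')
unmatched = cell_status(analysis, { 'license' => 'https://example.org/l' }, 'license', 'beta')
missing   = cell_status(analysis, {}, 'license', 'beta')
attic_img = cell_status(analysis, {}, 'image', 'attic')
```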
Source
# File lib/whimsy/sitestandards.rb, line 276
def process_csp(sites)
  sites.each do |site, data|
    csp = data.fetch('csp', '')
    squashed = csp.gsub(/ +/, ' ')
    m = DEFAULT_CSP_RE.match(squashed)
    if m # the syntax of the CSP appears to be OK
      extras = m.captures.uniq
      if extras.size == 1
        extra1 = extras.first
        if extra1 == ''
          data['csp_check'] = 'OK'
        else
          data['csp_check'] = "Extras: #{extra1}"
        end
      else
        data['csp_check'] = "Mixed Extras - should not happen: #{extras}"
      end
    elsif data['nonpmc'] and data['uri'] =~ %r{^https://(www\.)?apache\.org/} and squashed == WWW_CSP
      data['csp_check'] = 'OK'
    else # did not match
      data['csp_check'] = "Invalid: #{squashed}"
    end
  end
end
Process the csp entry of each site and record the result in csp_check as one of:
- OK: exact match
- Extras: has overrides
- Invalid: invalid
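The classification can be tried out with a toy pattern. DEFAULT_CSP_RE is not reproduced in this documentation, so `SKETCH_CSP_RE` below is a deliberately simplified, hypothetical pattern with one capture per directive, and `csp_status` is an assumed helper that returns the status instead of storing it.

```ruby
# Hypothetical two-directive policy pattern; each capture holds whatever
# extra sources follow 'self' in that directive.
SKETCH_CSP_RE = /\Adefault-src 'self'(.*?); script-src 'self'(.*?);\z/

def csp_status(csp)
  squashed = csp.gsub(/ +/, ' ') # collapse runs of spaces before matching
  m = SKETCH_CSP_RE.match(squashed)
  return "Invalid: #{squashed}" unless m

  extras = m.captures.uniq # identical extras in every directive collapse to one
  return "Mixed Extras: #{extras}" unless extras.size == 1

  extras.first.empty? ? 'OK' : "Extras: #{extras.first}"
end

exact    = csp_status("default-src 'self';  script-src 'self';")
override = csp_status("default-src 'self' https://cdn.example.org; script-src 'self' https://cdn.example.org;")
invalid  = csp_status('script-src *')
```

Deduplicating the captures is what separates a uniform override from a mixed one: a policy passes as "Extras" only when every directive carries the same additions.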