source: products/quintagroup.plonegooglesitemaps/trunk/quintagroup/plonegooglesitemaps/filters.txt @ 3002

Last change on this file since 3002 was 3002, checked in by mylan, 13 years ago

Merged revisions 3948,3950-3951,3954-3964,3978-3981,3984-3992,4016-4018,4028-4037,4039 via svnmerge from
http://svn.quintagroup.com/products/quintagroup.plonegooglesitemaps/branches/blacklist

........

r3948 | mylan | 2010-10-21 16:34:22 +0300 (Thu, 21 Oct 2010) | 1 line


#228: Added IBlackoutFilterUtility interface with skeleton of id and path utilities with appropriate tests.

........

r3950 | mylan | 2010-10-22 11:58:53 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added tests for default id and path filters

........

r3951 | mylan | 2010-10-22 11:59:25 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added default id and path blackout filters

........

r3954 | mylan | 2010-10-22 14:38:20 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added TestFormDataProcessing tests

........

r3955 | mylan | 2010-10-22 14:38:57 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added getBOFiltered method to sitemap common view

........

r3956 | mylan | 2010-10-22 15:17:40 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Simplify blacklists filter utility tests

........

r3957 | mylan | 2010-10-22 15:23:24 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Minor name updates in blackout list tests

........

r3958 | mylan | 2010-10-22 15:54:10 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added tests for relative path in path-filter

........

r3959 | mylan | 2010-10-22 15:54:50 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Added relative path processing in default path-filter

........

r3960 | mylan | 2010-10-22 15:57:18 +0300 (Fri, 22 Oct 2010) | 1 line


#228: Fixed name of the relative path filtering test

........

r3961 | mylan | 2010-10-25 13:51:32 +0300 (Mon, 25 Oct 2010) | 1 line


#228: Fix relative path filter tests and utility

........

r3962 | mylan | 2010-10-25 17:38:55 +0300 (Mon, 25 Oct 2010) | 1 line


#228: Fixed default Path filter

........

r3963 | mylan | 2010-10-25 17:49:27 +0300 (Mon, 25 Oct 2010) | 1 line


#228: Added request to extended list of args for a filter utility.

........

r3964 | mylan | 2010-10-25 17:50:57 +0300 (Mon, 25 Oct 2010) | 1 line


#228: Fixed rewrited blackout filter utility and functionality.

........

r3978 | mylan | 2010-10-28 19:26:11 +0300 (Thu, 28 Oct 2010) | 1 line


#228: Remake filter utility to multiadapter, fix tests

........

r3979 | mylan | 2010-10-28 19:57:24 +0300 (Thu, 28 Oct 2010) | 1 line


#228: Fix filter naming overhead, fix tests

........

r3980 | mylan | 2010-10-28 21:10:51 +0300 (Thu, 28 Oct 2010) | 1 line


#228: Fix overhead in parsing of filter arguments. Added test (breakage yet)

........

r3981 | mylan | 2010-10-29 12:57:34 +0300 (Fri, 29 Oct 2010) | 1 line


#228: Add blackout_list value clean-up on editing sitemap

........

r3984 | mylan | 2010-10-29 13:53:54 +0300 (Fri, 29 Oct 2010) | 1 line


#228: Update description of blackout list, remove preparations to filtering

........

r3985 | mylan | 2010-10-29 16:14:29 +0300 (Fri, 29 Oct 2010) | 1 line


#228: Added doc tests bases

........

r3986 | mylan | 2010-10-29 16:53:41 +0300 (Fri, 29 Oct 2010) | 1 line


#228: Added basic doctests of filtering

........

r3987 | mylan | 2010-10-29 16:57:40 +0300 (Fri, 29 Oct 2010) | 1 line


#228: Minor fixes of filtering doctests.

........

r3988 | mylan | 2010-11-01 13:31:04 +0200 (Mon, 01 Nov 2010) | 1 line


#228: Added doctests of id, path, default filters

........

r3989 | mylan | 2010-11-01 15:20:03 +0200 (Mon, 01 Nov 2010) | 1 line


#228: Fixed default behavior filter doctests

........

r3990 | mylan | 2010-11-01 15:20:51 +0200 (Mon, 01 Nov 2010) | 1 line


#228: Added example of new filter creation in doctests

........

r3991 | mylan | 2010-11-01 15:35:10 +0200 (Mon, 01 Nov 2010) | 1 line


#228: Clean-up, simplify filters doctests

........

r3992 | mylan | 2010-11-01 15:39:55 +0200 (Mon, 01 Nov 2010) | 1 line


#228: Force to show all doctests failures.

........

r4016 | mylan | 2010-11-05 14:19:46 +0200 (Fri, 05 Nov 2010) | 1 line


#228: Fix incorrect black_list field editing in tests

........

r4017 | mylan | 2010-11-05 14:21:35 +0200 (Fri, 05 Nov 2010) | 1 line


#228: minor docstring update for getBOFiltered method

........

r4018 | mylan | 2010-11-05 14:23:56 +0200 (Fri, 05 Nov 2010) | 1 line


#228: Remake filterOut method of into generator for default filters

........

r4028 | mylan | 2010-11-08 13:58:10 +0200 (Mon, 08 Nov 2010) | 1 line


#228: Added Plone v3.0 support

........

r4029 | mylan | 2010-11-08 16:35:22 +0200 (Mon, 08 Nov 2010) | 1 line


#228: Fixed tests issues

........

r4030 | mylan | 2010-11-08 16:49:28 +0200 (Mon, 08 Nov 2010) | 1 line


#228: Fixed ran of tearing down testing layers

........

r4031 | mylan | 2010-11-08 17:04:01 +0200 (Mon, 08 Nov 2010) | 1 line


#228: Fixed differencies in forms for plone<=3.1 and plone>3.1

........

r4032 | mylan | 2010-11-10 13:48:47 +0200 (Wed, 10 Nov 2010) | 1 line


#228: Switch to use collective.testcaselayer

........

r4033 | mylan | 2010-11-10 13:49:09 +0200 (Wed, 10 Nov 2010) | 1 line


#228: Fix doctest testcase

........

r4034 | mylan | 2010-11-10 14:34:17 +0200 (Wed, 10 Nov 2010) | 1 line


#228: Fixed list of required packages for testing

........

r4035 | mylan | 2010-11-11 12:29:06 +0200 (Thu, 11 Nov 2010) | 1 line


#228: remake filters to generators

........

r4036 | mylan | 2010-11-11 15:16:53 +0200 (Thu, 11 Nov 2010) | 1 line


#228: reviewed explanation, correct grammar for filters doctest

........

r4037 | mylan | 2010-11-11 15:19:24 +0200 (Thu, 11 Nov 2010) | 1 line


#228: updated histroy, bumped version to 1.6.0

........

r4039 | mylan | 2010-11-11 19:03:01 +0200 (Thu, 11 Nov 2010) | 1 line


#228:Fixed typo in doctests

........

  • Property svn:eol-style set to native
File size: 10.5 KB
Blackout filtering
==================

Filtering introduction
======================

The Sitemap portal type has an option for filtering out
objects that should be excluded from a sitemap. This option is
available in the sitemap edit form and is labeled
"Blackout entries".

In earlier versions of the package (<4.0.1 for the plone-4 branch
and <3.0.7 for the plone-3 branch) this field could
filter objects only by their ids, and looked like:

<pre>
  index.html
  index_html
</pre>

So, all objects with "index.html" or "index_html" ids were
excluded from the sitemap.

In newer versions of GoogleSitemaps, filtering was reworked
into a pluggable architecture. Filters are now named
multi-adapters. There are only two default filters: "id" and
"path".

Because different filters can be used, a new syntax was introduced
for the "Blackout entries" field. Every record in the field
should follow the specification:

  [<filter name>:]<filter arguments>

If no <filter name> is specified, the "id" filter will
be used. If <filter name> is specified, the system will look
for a <filter name>-named multi-adapter to the IBlackoutFilter
interface. If no such multi-adapter is found, the filter
will be ignored without raising any errors.


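The record syntax and the "ignore unknown filters" rule can be sketched
in plain Python. This is only an illustration: parse_blackout_entry,
apply_blackout and the dict-based registry are hypothetical names, and
the dict merely stands in for the named multi-adapter lookup, which is
not reproduced here.

```python
# Hypothetical sketch; none of these names come from the package itself.
DEFAULT_FILTER = "id"


def parse_blackout_entry(entry):
    """Split one record into (filter name, filter arguments)."""
    entry = entry.strip()
    name, sep, args = entry.partition(":")
    if not sep:
        # No "<filter name>:" prefix, so fall back to the "id" filter.
        return DEFAULT_FILTER, entry
    return name, args


def apply_blackout(items, entries, registry):
    """Apply every record in turn; unknown filter names are skipped."""
    for entry in entries:
        if not entry.strip():
            continue
        name, args = parse_blackout_entry(entry)
        filter_func = registry.get(name)  # stand-in for the adapter lookup
        if filter_func is None:
            continue  # no such filter registered: ignore silently
        items = filter_func(items, args)
    return list(items)


def drop_by_id(paths, object_id):
    """Toy "id" filter working on path strings instead of real objects."""
    return [p for p in paths if p.rsplit("/", 1)[-1] != object_id]


registry = {"id": drop_by_id}
paths = ["/doc1", "/doc2", "/front-page"]
result = apply_blackout(paths, ["id:doc1", "front-page", "nosuch:x"], registry)
print(result)  # ['/doc2']
```

The "nosuch:x" record has no registered filter, so it changes nothing,
while the unprefixed "front-page" record is handled by the "id" filter.
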
The following sections demonstrate how the filtering works,
including the behavior of the default filters ("id" and "path").

Setup demonstration environment
===============================

First, we must perform some setup. We use the testbrowser that is
shipped with Five, as this provides proper Zope 2 integration. Most
of the documentation, though, is in the underlying zope.testbrowser
package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The following is useful when writing and debugging testbrowser tests.
It lets us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in.
We will do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We have the login portlet, so let's use that.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getLink('Log in').click()
    >>> browser.url
    'http://nohost/plone/login_form'
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create some content for the demonstrations.

In the root of the portal:

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder:

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

This brings us to the edit form of the newly created content sitemap.
We are interested in two things: the "Blackout entries" field must
be present in the form, and by default it should be empty.

    >>> file("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form...' type='submit'>
    >>> save_button.click()


Clicking the "Save" button leads us to the resulting sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...


A "sitemap.xml" link should appear on the "Settings" page of the
Plone Google Sitemap configlet once a "Content Sitemap"
has been added.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link points to the edit form of the newly created sitemap.xml.
Let us prepare the view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]


No filters
==========

The created sitemap has no filters, so all documents should appear in it.

    >>> browser.open(smview_url)
    >>> file("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page really is a sitemap:

    >>> print browser.contents
    <?xml version="1.0" encoding=...


Create a regular expression that helps us test which URLs pass the filters.
It captures the part of each <loc> value that follows the portal URL.

    >>> import re
    >>> reloc = re.compile("<loc>%s([^<]*)</loc>" % re.escape(self.portal.absolute_url()), re.S)

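To see what such a pattern extracts, here is a minimal standalone
sketch. The portal URL and the sitemap body below are invented for
illustration and are not output produced by the package.

```python
import re

# Assumed values, for illustration only.
portal_url = "http://nohost/plone"
sitemap_xml = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://nohost/plone/doc1</loc></url>
  <url><loc>http://nohost/plone/Members/test_user_1_/doc1</loc></url>
</urlset>
"""

# Capture whatever follows the portal URL inside each <loc> element.
loc_pattern = re.compile("<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)
relative_paths = sorted(loc_pattern.findall(sitemap_xml))
print("\n".join(relative_paths))
```

Sorting the captured paths makes the comparison independent of the
order in which the sitemap lists its entries.
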
Check that all 4 documents and the default front-page are present
in the sitemap when no filters are set.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page


Check "id" filter
=================

Go to the edit form of the sitemap and add "doc1"
and "front-page" lines with the "id:" prefix to the
"Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     id:doc1
    ...     id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

The "doc1" and "front-page" documents should be excluded from the
sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Check "path" filter
===================

Suppose we want to exclude "front-page" from the portal root
and the "doc2" document located in the test_user_1_ home folder,
but leave "doc2" in the portal root and all other objects untouched.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    path:/Members/test_user_1_/doc2
    ...    path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

The "/Members/test_user_1_/doc2" and "/front-page" objects should
be excluded from the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2


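The difference between the two default filters can be illustrated on
plain path strings. This is a simplified sketch, not the package's
implementation: filter_by_id and filter_by_path are hypothetical
helpers, and exact path matching is an assumption made for the example.

```python
# Paths relative to the portal root, as listed in the unfiltered sitemap.
ALL_PATHS = [
    "/Members/test_user_1_/doc1",
    "/Members/test_user_1_/doc2",
    "/doc1",
    "/doc2",
    "/front-page",
]


def filter_by_id(paths, object_id):
    """Drop every path whose last segment equals object_id."""
    return [p for p in paths if p.rsplit("/", 1)[-1] != object_id]


def filter_by_path(paths, path):
    """Drop only the entry located at exactly this path (assumption)."""
    return [p for p in paths if p != path]


# An id filter removes "doc2" everywhere it occurs ...
by_id = filter_by_id(ALL_PATHS, "doc2")

# ... while path filters remove only the addressed objects.
by_path = filter_by_path(ALL_PATHS, "/Members/test_user_1_/doc2")
by_path = filter_by_path(by_path, "/front-page")

print(by_id)
print(by_path)
```

With the path filters, "/doc2" in the portal root survives even though
an object with the same id in the member folder is removed.
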
Check default filter
====================

This raises a question: which filter is used when no
filter name prefix is specified (as with old-style entries,
for example)?

Go to the edit form of the sitemap and add "doc1" and
"front-page" lines without any filter name prefix to the
"Blackout entries" field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     doc1
    ...     front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

The "id" filter must be used as the default filter, so all "doc1" and
"front-page" objects should be excluded from the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Creating your own filters
=========================

Suppose we want to create our own blackout filter,
which behaves like the id filter but with some differences.
Our filter has the following format:

  (+|-)<filtered id>

  - when the first sign is "+", only objects with <filtered id>
    should be left in the sitemap after filtering;
  - when the first sign is "-", all objects with <filtered id>
    should be excluded from the sitemap (like the default id
    filter).

You need to create a new IBlackoutFilter multi-adapter
and register it with a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         # The first character is the sign, the rest is the id.
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         # getId is read as an attribute here, not called.
    ...         if sign == "+":
    ...             return [b for b in fdata if b.getId == fid]
    ...         elif sign == "-":
    ...             return [b for b in fdata if b.getId != fid]
    ...         return fdata


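The selection logic of such a filter can be tried outside Zope with
stand-in objects. In this sketch, Brain and signed_id_filter are
hypothetical names; the namedtuple only imitates an object that exposes
getId as an attribute. Unlike the class above, the sketch also
tolerates an empty argument string.

```python
from collections import namedtuple

# Stand-in for a sitemap candidate; only the attributes used below.
Brain = namedtuple("Brain", ["getId", "path"])


def signed_id_filter(fdata, fargs):
    """Keep ("+") or drop ("-") the items whose id follows the sign."""
    sign, fid = fargs[:1], fargs[1:]
    if sign == "+":
        return [b for b in fdata if b.getId == fid]
    if sign == "-":
        return [b for b in fdata if b.getId != fid]
    return list(fdata)  # unknown or missing sign: leave data untouched


brains = [
    Brain("doc1", "/doc1"),
    Brain("doc2", "/doc2"),
    Brain("doc1", "/Members/test_user_1_/doc1"),
    Brain("front-page", "/front-page"),
]

kept = [b.path for b in signed_id_filter(brains, "+doc1")]
dropped = [b.path for b in signed_id_filter(brains, "-doc1")]
print(kept)     # ['/doc1', '/Members/test_user_1_/doc1']
print(dropped)  # ['/doc2', '/front-page']
```
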
Now register this new filter as a named multi-adapter:

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter.
Now test the newly created filter.

Check whether whitelist filtering ("+" prefix) works correctly.
Go to the edit form of the sitemap and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...    signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

Only objects with the "doc1" id should be left in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


Finally, check whether blacklist filtering ("-" prefix)
works correctly.
Go to the edit form of the sitemap and add "signedid:-doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ...     signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

All objects except those with the "doc1" id must be included in
the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page