Blackout filtering
==================

A sitemap can filter out objects that should not appear in it.
This option is available in the sitemap edit form as the
"Blackout entries" lines field.

In earlier versions of the package (<3.0.7 and <4.0.1) the field
filtered out objects only by their ids, and looked like this:

    <pre>
      index.html
      index_html
    </pre>

So all objects with the id "index.html" or "index_html" were excluded
from the sitemap.

In newer versions of GoogleSitemaps, filtering was reworked into a
pluggable architecture. Filters are now named multi-adapters.
By default only the two most useful filters are provided: "id" and
"path".

Because different filters can be used, a new syntax applies
to the "Blackout entries" field. Every record in the field
should follow the spec:

    [<filter name>:]<filter arguments>

If no <filter name> is specified, the "id" filter is used.
If <filter name> is specified, the system looks up a multi-adapter
named <filter name> for the IBlackoutFilter interface.
If no such multi-adapter is found, the record is silently ignored.

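The record syntax above can be illustrated with a small stand-alone sketch.
It is not the package's implementation: parse_entry, apply_entries and the
plain dict that stands in for the adapter registry are hypothetical names,
and splitting a record on its first colon is an assumption.

```python
DEFAULT_FILTER = "id"  # the default filter named in the text above


def parse_entry(record):
    """Split one "Blackout entries" record into (filter name, arguments).

    Hypothetical helper: the split-on-first-colon rule is an assumption.
    """
    record = record.strip()
    name, sep, args = record.partition(":")
    if not sep:
        # No "<filter name>:" prefix, so fall back to the default filter.
        return DEFAULT_FILTER, record
    return name, args


def apply_entries(items, records, filters):
    """Apply each record's filter in turn; unknown filter names are skipped."""
    for record in records:
        if not record.strip():
            continue  # blank lines carry no filter
        name, args = parse_entry(record)
        filter_func = filters.get(name)
        if filter_func is None:
            continue  # no filter registered under this name: ignore silently
        items = filter_func(items, args)
    return items


def id_filter(items, args):
    """Toy "id" filter working on plain id strings."""
    return [item for item in items if item != args]


remaining = apply_entries(
    ["doc1", "doc2", "front-page"],
    ["id:doc1", "nosuch:doc2", "front-page"],
    {"id": id_filter},
)
```

Here remaining is ["doc2"]: the unknown "nosuch" filter is skipped, and the
unprefixed "front-page" record falls back to the "id" filter.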

Setup
=====

First, we must perform some setup. We use the testbrowser that is shipped
with Five, as this provides proper Zope 2 integration. Most of the
documentation, though, is in the underlying zope.testbrowser package.

    >>> from Products.Five.testbrowser import Browser
    >>> browser = Browser()
    >>> portal_url = self.portal.absolute_url()

The following is useful when writing and debugging testbrowser tests. It lets
us see all error messages in the error_log.

    >>> self.portal.error_log._ignored_exceptions = ()

With that in place, we can go to the portal front page and log in. We will
do this using the default user from PloneTestCase:

    >>> from Products.PloneTestCase.setup import portal_owner, default_password
    >>> browser.open(portal_url)

We have the login portlet, so let's use that.

    >>> browser.open('http://nohost/plone/login_form')
    >>> browser.getLink('Log in').click()
    >>> browser.url
    'http://nohost/plone/login_form'
    >>> browser.getControl('Login Name').value = portal_owner
    >>> browser.getControl('Password').value = default_password
    >>> browser.getControl('Log in').click()
    >>> "You are now logged in" in browser.contents
    True
    >>> "Login failed" in browser.contents
    False
    >>> browser.url
    'http://nohost/plone/login_form'


Functionality
=============

First, create several documents for the demonstration.

In the root of the portal:

    >>> self.addDocument(self.portal, "doc1", "Document 1 text")
    >>> self.addDocument(self.portal, "doc2", "Document 2 text")

And in the member's folder:

    >>> self.addDocument(self.folder, "doc1", "Member Document 1 text")
    >>> self.addDocument(self.folder, "doc2", "Member Document 2 text")

We also need to add a sitemap for the demonstration.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> browser.getControl('Add Content Sitemap').click()

This brings up the edit form of the newly created content sitemap.
We are interested in two things: the "Blackout entries" field must be
present in the form, and it should be empty by default.

    >>> open("/tmp/browser.0.html","wb").write(browser.contents)
    >>> blackout_list = browser.getControl("Blackout entries")
    >>> blackout_list
    <Control name='blackout_list:lines' type='textarea'>
    >>> blackout_list.value == ""
    True
    >>> save_button = browser.getControl("Save")
    >>> save_button
    <SubmitControl name='form.button.save' type='submit'>
    >>> save_button.click()

Clicking the "Save" button leads us to the resulting sitemap view.

    >>> print browser.contents
    <?xml version="1.0" encoding=...

After a "Content Sitemap" has been added, a "sitemap.xml" link appears
on the "Settings" tab of the Plone Google Sitemap configlet.

    >>> browser.open(portal_url + "/prefs_gsm_settings")
    >>> smedit_link = browser.getLink('sitemap.xml')
    >>> smedit_url = smedit_link.url

This link leads to the edit form of the newly created sitemap.xml.
We also prepare a view link to simplify the following demonstrations.

    >>> smedit_url.endswith("sitemap.xml/edit")
    True
    >>> smview_url = smedit_url[:-5]


No filters
==========

The resulting sitemap has no filters, so all documents should be present in it.

    >>> browser.open(smview_url)
    >>> open("/tmp/browser.1.html","wb").write(browser.contents)
    >>> no_filters_content = browser.contents

Check that the resulting page is a real sitemap:

    >>> print browser.contents
    <?xml version="1.0" encoding=...

To check which urls pass the filters, create a regular expression:

    >>> import re
    >>> reloc = re.compile("<loc>%s([^<]*)</loc>" % self.portal.absolute_url(), re.S)

With the help of the reloc regular expression, check that all 4 documents
plus the default front-page are present in the sitemap without filters.

    >>> no_filters_res = reloc.findall(no_filters_content)
    >>> no_filters_res.sort()
    >>> print "\n".join(no_filters_res)
    /Members/test_user_1_/doc1
    /Members/test_user_1_/doc2
    /doc1
    /doc2
    /front-page

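
The URL extraction used in this section can also be tried in isolation.
The sketch below runs the same kind of pattern against a hand-written
sitemap fragment. SAMPLE_SITEMAP and portal_paths are hypothetical names
introduced for illustration, the fragment is not real output of the
package, and re.escape is an addition that the test above does not use.

```python
import re

PORTAL_URL = "http://nohost/plone"  # portal URL seen in the test browser

# Hand-written stand-in for a generated sitemap (not real output).
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://nohost/plone/front-page</loc></url>
  <url><loc>http://nohost/plone/doc1</loc></url>
  <url><loc>http://nohost/plone/Members/test_user_1_/doc1</loc></url>
</urlset>
"""


def portal_paths(sitemap_xml, portal_url):
    """Return the sorted portal-relative paths of all <loc> entries."""
    # re.escape guards against regex metacharacters (such as ".") in the URL.
    pattern = re.compile("<loc>%s([^<]*)</loc>" % re.escape(portal_url), re.S)
    return sorted(pattern.findall(sitemap_xml))


paths = portal_paths(SAMPLE_SITEMAP, PORTAL_URL)
```

Capturing only the part after the portal URL keeps the expected output
independent of the host name the tests happen to run under.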

Check "id" filter
=================

Go to the edit form of the sitemap and add "doc1"
and "front-page" lines with the "id:" prefix to the
"Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ... id:doc1
    ... id:front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> id_filter_content = browser.contents

As a result, all "doc1" and "front-page" documents must be
filtered out of the sitemap.

    >>> id_filter_res = reloc.findall(id_filter_content)
    >>> id_filter_res.sort()
    >>> print "\n".join(id_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Check "path" filter
===================

Suppose we want to filter out the doc2 of test_user_1_ (but not the
one in the portal root) and the front-page in the portal root.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ... path:/Members/test_user_1_/doc2
    ... path:/front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> path_filter_content = browser.contents

As a result, the "doc2" of that member and the "front-page" document
must be filtered out of the sitemap.

    >>> path_filter_res = reloc.findall(path_filter_content)
    >>> path_filter_res.sort()
    >>> print "\n".join(path_filter_res)
    /Members/test_user_1_/doc1
    /doc1
    /doc2

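
The difference between the "id" and "path" filters can be sketched with
plain Python objects. Everything here is an assumption made for
illustration: Brain is a stand-in for a catalog brain, "/plone" is an
assumed portal path, and the exact-match comparison in filter_by_path may
differ from what the package's "path" filter really does.

```python
from collections import namedtuple

# Stand-in for a catalog brain; only the two fields used here.
Brain = namedtuple("Brain", ["getId", "path"])

PORTAL_PATH = "/plone"  # assumed physical path of the portal

BRAINS = [
    Brain("doc1", "/plone/doc1"),
    Brain("doc2", "/plone/doc2"),
    Brain("front-page", "/plone/front-page"),
    Brain("doc1", "/plone/Members/test_user_1_/doc1"),
    Brain("doc2", "/plone/Members/test_user_1_/doc2"),
]


def filter_by_id(brains, blocked_id):
    """Drop every brain whose id equals blocked_id, wherever it lives."""
    return [b for b in brains if b.getId != blocked_id]


def filter_by_path(brains, blocked_path, portal_path=PORTAL_PATH):
    """Drop only the brain located at portal_path + blocked_path."""
    full_path = portal_path + blocked_path
    return [b for b in brains if b.path != full_path]


def relative_paths(brains, portal_path=PORTAL_PATH):
    """Return sorted paths relative to the portal root."""
    return sorted(b.path[len(portal_path):] for b in brains)


by_id = relative_paths(filter_by_id(BRAINS, "doc2"))
by_path = relative_paths(filter_by_path(BRAINS, "/Members/test_user_1_/doc2"))
```

In this sketch by_id loses both "doc2" objects, while by_path loses only
the member's copy and keeps "/doc2" in the portal root.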

Check default filter
====================

Let's check which filter is used for old-fashioned entries
(without any filter name prefix).

Go to the edit form of the sitemap and add "doc1" and "front-page"
lines without any filter name prefix to the "Blackout entries"
field.

    >>> browser.open(portal_url + "/sitemap.xml/edit")
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ... doc1
    ... front-page
    ... """
    >>> browser.getControl("Save").click()
    >>> default_filter_content = browser.contents

By default the "id" filter must be used, so all "doc1" and "front-page"
objects must be filtered out of the sitemap.

    >>> default_filter_res = reloc.findall(default_filter_content)
    >>> default_filter_res.sort()
    >>> print "\n".join(default_filter_res)
    /Members/test_user_1_/doc2
    /doc2


Creating your own filters
=========================

Suppose we want to create our own blackout filter,
which behaves like the id filter, but with some differences.
Our filter has the following format:

    (+|-)<filtered id>

- if the first sign is "+", only objects with <filtered id>
  must remain after filtering;
- if the first sign is "-", all objects with <filtered id> must be
  filtered out (like the default id filter).

You need to create a new IBlackoutFilter multi-adapter
and register it under a unique name.

    >>> from zope.component import adapts
    >>> from zope.interface import Interface, implements
    >>> from zope.publisher.interfaces.browser import IBrowserRequest
    >>> from quintagroup.plonegooglesitemaps.interfaces import IBlackoutFilter
    >>> class SignedIdFilter(object):
    ...     adapts(Interface, IBrowserRequest)
    ...     implements(IBlackoutFilter)
    ...     def __init__(self, context, request):
    ...         self.context = context
    ...         self.request = request
    ...     def filterOut(self, fdata, fargs):
    ...         # fargs is "<sign><id>", e.g. "+doc1" or "-doc1"
    ...         sign = fargs[0]
    ...         fid = fargs[1:]
    ...         if sign == "+":
    ...             # keep only the objects with the given id
    ...             return [b for b in fdata if b.getId == fid]
    ...         elif sign == "-":
    ...             # drop the objects with the given id
    ...             return [b for b in fdata if b.getId != fid]
    ...         return fdata

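
The branching in filterOut can be exercised without a Plone site. The
sketch below repeats the same logic as a plain function over stand-in
objects. Brain and signed_id_filter are hypothetical names, and slicing
with fargs[:1] instead of indexing with fargs[0] is a small deviation
that avoids an IndexError on an empty argument.

```python
from collections import namedtuple

Brain = namedtuple("Brain", ["getId", "path"])  # stand-in for a catalog brain


def signed_id_filter(fdata, fargs):
    """Same branching as SignedIdFilter.filterOut, without adapter wiring."""
    sign, fid = fargs[:1], fargs[1:]
    if sign == "+":
        # White filtering: keep only the objects with the given id.
        return [b for b in fdata if b.getId == fid]
    if sign == "-":
        # Black filtering: drop the objects with the given id.
        return [b for b in fdata if b.getId != fid]
    return fdata  # unknown or missing sign: leave the data untouched


BRAINS = [
    Brain("doc1", "/doc1"),
    Brain("doc2", "/doc2"),
    Brain("front-page", "/front-page"),
]

kept = [b.path for b in signed_id_filter(BRAINS, "+doc1")]
dropped = [b.path for b in signed_id_filter(BRAINS, "-doc1")]
```

With "+doc1" only "/doc1" survives, and with "-doc1" everything except
"/doc1" survives.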

Now register this new filter as a named multi-adapter:

    >>> from zope.component import provideAdapter
    >>> provideAdapter(SignedIdFilter,
    ...                name=u'signedid')

That is all that is needed to add a new filter.
Now test whether the newly added filter is really taken into account.

Check that white filtering (the "+" prefix) works correctly.
Go to the edit form of the sitemap and add "signedid:+doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ... signedid:+doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

As a result, only objects with the id "doc1" must be present in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc1
    /doc1


And finally, check that black filtering (the "-" prefix) works.
Go to the edit form of the sitemap and add "signedid:-doc1"
to the "Blackout entries" field.

    >>> browser.open(smedit_url)
    >>> filtercontrol = browser.getControl("Blackout entries")
    >>> filtercontrol.value = """
    ... signedid:-doc1
    ... """
    >>> browser.getControl("Save").click()
    >>> signedid_filter_content = browser.contents

As a result, all objects except those with the id "doc1" must be present
in the sitemap.

    >>> signedid_filter_res = reloc.findall(signedid_filter_content)
    >>> signedid_filter_res.sort()
    >>> print "\n".join(signedid_filter_res)
    /Members/test_user_1_/doc2
    /doc2
    /front-page