
Contents

Preface

Problem analysis

Creating page buffers

Losing page buffers

Write-Protect

Dirty Page w/o Buffers

Fix


Preface

This problem occurred on 3.10.0-514.el7. A matching case and its fix were quick to find in the RHEL knowledge base, but understanding how the problem arises and how the fix resolves it took real effort.

The RHEL article is:

RHEL7: kernel crash in xfs_vm_writepage - kernel BUG at fs/xfs/xfs_aops.c:1062! - Red Hat Customer Portal

The call trace is:

[1004630.854317] kernel BUG at fs/xfs/xfs_aops.c:1062!
[1004630.854894] invalid opcode: 0000 [#1] SMP 
[1004630.861333] CPU: 6 PID: 56715 Comm: kworker/u48:4 Tainted: G        W      ------------   3.10.0-514.el7.x86_64 #1
[1004630.862046] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 12/28/2015
[1004630.862703] Workqueue: writeback bdi_writeback_workfn (flush-253:28)
[1004630.863414] task: ffff881f8436de20 ti: ffff881f23a4c000 task.ti: ffff881f23a4c000
[1004630.864117] RIP: 0010:[<ffffffffa083f2fb>]  [<ffffffffa083f2fb>] xfs_vm_writepage+0x58b/0x5d0 [xfs]
[1004630.864860] RSP: 0018:ffff881f23a4f948  EFLAGS: 00010246
[1004630.865749] RAX: 002fffff00040029 RBX: ffff881bedd50308 RCX: 000000000000000c
[1004630.866466] RDX: 0000000000000008 RSI: ffff881f23a4fc40 RDI: ffffea00296b7800
[1004630.867218] RBP: ffff881f23a4f9f0 R08: fffffffffffffffe R09: 000000000001a098
[1004630.867941] R10: ffff88207ffd6000 R11: 0000000000000000 R12: ffff881bedd50308
[1004630.868656] R13: ffff881f23a4fc40 R14: ffff881bedd501b8 R15: ffffea00296b7800
[1004630.869399] FS:  0000000000000000(0000) GS:ffff881fff180000(0000) knlGS:0000000000000000
[1004630.870147] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1004630.870868] CR2: 0000000000eb3d30 CR3: 0000001ff79dc000 CR4: 00000000001407e0
[1004630.871610] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1004630.872349] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[1004630.873072] Stack:
[1004630.873749]  0000000000008000 ffff880070b03644 ffff881f23a4fc40 ffff881f23a4fa68
[1004630.874480]  ffff881f23a4fa80 ffffea00296b7800 0000000000001000 000000000000000e
[1004630.875223]  0000000000001000 ffffffff81180981 0000000000000000 ffff881bedd50310
[1004630.875957] Call Trace:
[1004630.876665]  [<ffffffff81180981>] ? find_get_pages_tag+0xe1/0x1a0
[1004630.877417]  [<ffffffff8118b3b3>] __writepage+0x13/0x50
[1004630.878173]  [<ffffffff8118bed1>] write_cache_pages+0x251/0x4d0
[1004630.878915]  [<ffffffffa00c170a>] ? enqueue_cmd_and_start_io+0x3a/0x40 [hpsa]
[1004630.879626]  [<ffffffff8118b3a0>] ? global_dirtyable_memory+0x70/0x70
[1004630.880368]  [<ffffffff8118c19d>] generic_writepages+0x4d/0x80
[1004630.881157]  [<ffffffffa083e063>] xfs_vm_writepages+0x53/0x90 [xfs]
[1004630.881907]  [<ffffffff8118d24e>] do_writepages+0x1e/0x40
[1004630.882643]  [<ffffffff81228730>] __writeback_single_inode+0x40/0x210
[1004630.883403]  [<ffffffff8122941e>] writeback_sb_inodes+0x25e/0x420
[1004630.884141]  [<ffffffff8122967f>] __writeback_inodes_wb+0x9f/0xd0
[1004630.884863]  [<ffffffff81229ec3>] wb_writeback+0x263/0x2f0   
[1004630.885610]  [<ffffffff810ab776>] ? set_worker_desc+0x86/0xb0
[1004630.886378]  [<ffffffff8122bd05>] bdi_writeback_workfn+0x115/0x460
[1004630.887142]  [<ffffffff810c4cf8>] ? try_to_wake_up+0x1c8/0x330
[1004630.887875]  [<ffffffff810a7f3b>] process_one_work+0x17b/0x470
[1004630.888638]  [<ffffffff810a8d76>] worker_thread+0x126/0x410   
[1004630.889389]  [<ffffffff810a8c50>] ? rescuer_thread+0x460/0x460
[1004630.890126]  [<ffffffff810b052f>] kthread+0xcf/0xe0
[1004630.890816]  [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[1004630.891521]  [<ffffffff81696418>] ret_from_fork+0x58/0x90
[1004630.892229]  [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[1004630.892877] Code: e0 80 3d 4d b4 06 00 00 0f 85 a4 fe ff ff be d7 03 00 00 48 c7 c7 4a e0 88 a0 e8 61 66 84 e0 c6 05 2f b4 06 00 01 e9 87 fe ff ff <0f> 0b 8b 4d a4 e9 e8 fb ff ff 41 b9 01 00 00 00 e9 69 fd ff ff 
[1004630.894245] RIP  [<ffffffffa083f2fb>] xfs_vm_writepage+0x58b/0x5d0 [xfs]
[1004630.894890]  RSP <ffff881f23a4f94

The BUG is triggered here:

xfs_vm_writepage()
---
	...
	bh = head = page_buffers(page);
	...
---

#define page_buffers(page)					\
	({							\
		BUG_ON(!PagePrivate(page));			\
		((struct buffer_head *)page_private(page));	\
	})

Problem analysis

Creating page buffers

In the Linux kernel, the buffer head exists to bridge the basic units of the storage subsystem and the memory subsystem:

  • sector: the basic unit of storage, 512 bytes
  • page: the basic unit of memory, 4096 bytes

The file cache, i.e. the page cache, is where the storage and memory subsystems meet, and a buffer_head describes one block of a page in the page cache (a single sector in the smallest case). When a filesystem is formatted with an fsblock size of 512, 1024 or 2048 bytes, or when a raw block device is accessed, each page maps to multiple buffer_heads.
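The block-size arithmetic above can be written down directly. This is a minimal sketch that assumes a 4096-byte page; `MODEL_PAGE_SIZE` and `buffers_per_page()` are hypothetical names for illustration, not kernel symbols.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096u   /* assumption: 4 KiB pages, as on x86_64 */
#define MODEL_SECTOR_SIZE 512u  /* the storage unit named in the text */

/*
 * Hypothetical helper: number of buffer_heads one page needs for a given
 * filesystem block size. Returns 0 for sizes that are smaller than a
 * sector, larger than a page, or that do not divide the page evenly.
 */
static unsigned int buffers_per_page(unsigned int blocksize)
{
    unsigned int n;

    if (blocksize < MODEL_SECTOR_SIZE || blocksize > MODEL_PAGE_SIZE)
        return 0;
    if (MODEL_PAGE_SIZE % blocksize != 0)
        return 0;

    n = MODEL_PAGE_SIZE / blocksize;
    /* With 512-byte blocks a 4 KiB page holds at most 8 buffers. */
    assert(n >= 1 && n <= MODEL_PAGE_SIZE / MODEL_SECTOR_SIZE);
    return n;
}
```

For example, `buffers_per_page(1024)` yields 4, while a 4096-byte block size yields a single buffer per page.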

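In recent years mainstream filesystems have largely abandoned 512-byte fsblocks: they are still supported, but optimization effort targets 4K. In newer kernels XFS has even dropped buffer_head and uses the page directly as the basic unit of I/O.

So at which points are buffer_heads created?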

3.10.0-514.el7

【A】
generic_perform_write()
  -> aops->write_begin()
     xfs_vm_write_begin()
       -> grab_cache_page_write_begin()
       -> __block_write_begin()
          -> create_page_buffers()

【B】
do_shared_fault()
  -> __do_fault()
     -> vma->vm_ops->page_mkwrite()
        xfs_filemap_page_mkwrite()
          -> __block_page_mkwrite()
             -> __block_write_begin()
                -> create_page_buffers()

do_mpage_readpage()
  -> block_read_full_page() // taken when the bhs of a page are in inconsistent states, e.g. some mapped and some unmapped; each bh is then handled individually
     -> create_page_buffers()

In general, paths 【A】 and 【B】 guarantee that page buffers exist before a page is written; they correspond to writing a file through system calls and through mmap, respectively.

Losing page buffers

So how did a page marked dirty lose its buffers in this case?

The typical scenario in which page buffers are released:

3.10.0-514.el7
shrink_page_list()
  -> try_to_unmap()
  -> try_to_release_page()
     -> aops->releasepage()
        xfs_vm_releasepage()
          -> try_to_free_buffers()
  -> __remove_mapping()

try_to_unmap() is called before try_to_release_page(); it clears the PTE and checks whether the page needs to be marked dirty:

3.10.0-514.el7
try_to_unmap()
  -> try_to_unmap_file()
     -> try_to_unmap_one()
        -> set_page_dirty() // if pte_dirty()

This guarantees that the page and its buffers are marked dirty before try_to_release_page() runs:

3.10.0-514.el7
xfs_vm_set_page_dirty()
---
	spin_lock(&mapping->private_lock);
	if (page_has_buffers(page)) {
		struct buffer_head *head = page_buffers(page);
		struct buffer_head *bh = head;

		do {
			if (offset < end_offset)
				set_buffer_dirty(bh);
			bh = bh->b_this_page;
			offset += 1 << inode->i_blkbits;
		} while (bh != head);
	}
---

try_to_free_buffers()
  -> drop_buffers()
     -> buffer_busy()
        -> atomic_read(&bh->b_count) | (bh->b_state & ((1 << BH_Dirty) | (1 << BH_Lock)))

A buffer that carries the dirty flag is not released. The truncate and invalidate paths work in a similar way.
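The "busy buffers are never freed" rule can be modelled in user space. The sketch below is a simplified stand-in, not the kernel implementation: `struct model_bh`, `model_buffer_busy()` and `model_drop_buffers()` are hypothetical names, and the structure keeps only the fields that the buffer_busy() expression above looks at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the BH_Dirty and BH_Lock state bits. */
enum { MODEL_BH_DIRTY = 1u << 0, MODEL_BH_LOCK = 1u << 1 };

struct model_bh {
    unsigned int state;          /* MODEL_BH_* bits */
    int count;                   /* models bh->b_count */
    struct model_bh *this_page;  /* circular ring, like b_this_page */
};

/* Same test as the quoted buffer_busy(): referenced, dirty or locked. */
static bool model_buffer_busy(const struct model_bh *bh)
{
    return bh->count != 0 ||
           (bh->state & (MODEL_BH_DIRTY | MODEL_BH_LOCK)) != 0;
}

/*
 * Models the decision in drop_buffers(): the page's buffers may be
 * dropped only if no buffer on the ring is busy.
 */
static bool model_drop_buffers(const struct model_bh *head)
{
    const struct model_bh *bh = head;

    assert(head != NULL);
    do {
        if (model_buffer_busy(bh))
            return false;
        bh = bh->this_page;
    } while (bh != head);
    return true;
}
```

A ring in which every buffer is clean, unlocked and unreferenced can be dropped; a single dirty buffer is enough to keep the whole ring attached to the page.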

But there is one very special scenario:

3.10.0-514.el7
shrink_active_list()
---
	if (unlikely(buffer_heads_over_limit)) {
		if (page_has_private(page) && trylock_page(page)) {
			if (page_has_private(page))
				try_to_release_page(page, 0);
			unlock_page(page);
		}
	}
---

Here the buffers are released directly, without a preceding try_to_unmap().

Write-Protect

A user write to an mmap'ed page triggers a page fault and marks the page dirty through the following mechanism:

3.10.0-514.el7
generic_writepages()
  -> clear_page_dirty_for_io()
     -> page_mkclean()
        -> page_mkclean_file()
           -> page_mkclean_one()
              -> pte_wrprotect()
              -> pte_mkclean()

handle_pte_fault()
  -> do_wp_page() // pte_present() && !pte_write()
     -> wp_page_shared()
        -> do_page_mkwrite()
           -> xfs_filemap_page_mkwrite()
              -> __block_page_mkwrite()
                 -> lock_page()
                 -> __block_write_begin()
                 -> block_commit_write()
                 -> set_page_dirty(page);
                 -> wait_for_stable_page(page);
        -> wp_page_reuse()
        -> set_page_dirty()
        -> unlock_page()

Before writepage runs, clear_page_dirty_for_io(), under the protection of the page lock, clears the page's dirty flag and removes write permission from the mmap PTEs, making the page write-protected. The next time the user writes to the page, a page fault is triggered and the kernel marks the page dirty there, creating buffers for the page along the way. This guarantees that every write through mmap is handed to the writeback subsystem via a page fault.

When the write-protect page fault occurs, the write has not happened yet, so the PTE dirty bit is not set. Once the write does happen, the sequence shown in the code above must already have run, so the buffers must exist.
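The write-protect cycle can be reduced to a small state machine. This is only a user-space sketch under simplifying assumptions (one PTE per page, no locking); `struct model_pte`, `struct model_page` and both functions are hypothetical names, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_pte  { bool writable; bool dirty; };
struct model_page { bool dirty; bool has_buffers; };

/*
 * Models clear_page_dirty_for_io(): page_mkclean() write-protects and
 * cleans the PTE, then the page dirty flag is cleared.
 */
static void model_clear_page_dirty_for_io(struct model_page *pg,
                                          struct model_pte *pte)
{
    assert(pg != NULL && pte != NULL);
    pte->writable = false;   /* pte_wrprotect() */
    pte->dirty = false;      /* pte_mkclean() */
    pg->dirty = false;
}

/*
 * Models a user store through the mapping. Returns true if a
 * write-protect fault had to be taken first.
 */
static bool model_user_write(struct model_page *pg, struct model_pte *pte)
{
    bool faulted = false;

    assert(pg != NULL && pte != NULL);
    if (!pte->writable) {
        /* page_mkwrite path: buffers are created and the page is
         * dirtied before the PTE becomes writable again. */
        pg->has_buffers = true;
        pg->dirty = true;
        pte->writable = true;
        faulted = true;
    }
    pte->dirty = true;       /* set by hardware when the store retires */
    return faulted;
}
```

After `model_clear_page_dirty_for_io()`, the first `model_user_write()` faults and recreates the dirty state and the buffers; later writes go straight through until the next writeback pass write-protects the PTE again.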

Dirty Page w/o Buffers

Let's compare the writepages call chains of ext4 and xfs:

ext4_writepages()
  -> mpage_prepare_extent_to_map()
     -> pagevec_lookup_tag()
     -> lock_page()
     -> wait_on_page_writeback()
     -> mpage_process_page_bufs()
        -> mpage_submit_page()
           -> clear_page_dirty_for_io(page)
           -> ext4_bio_write_page()
              -> set_page_writeback()
              -> io_submit_add_bh()
                 -> clear_buffer_dirty()
           -> unlock_page()

xfs_vm_writepages()
  -> generic_writepages()
     -> write_cache_pages()
        -> pagevec_lookup_tag()
        -> lock_page()
        -> xfs_vm_writepage()
           -> lock_buffer()
           -> xfs_add_to_ioend()
           -> xfs_start_page_writeback()
              -> clear_page_dirty_for_io()
              -> set_page_writeback()
              -> unlock_page() <<----------------------- HERE!!!!
           -> xfs_submit_ioend()
              -> xfs_start_buffer_writeback()
                 -> mark_buffer_async_write()
                 -> set_buffer_uptodate()
                 -> clear_buffer_dirty()

In the ext4 call chain, both the page dirty flag and the buffer dirty flags are cleared under the page lock. In xfs, the buffers are cleaned outside the page lock. If we now bring in the page_mkwrite call chain from the page fault path, the following race can occur:

writeback workqueue             user page fault
xfs_vm_writepages()             xfs_filemap_page_mkwrite()
  lock_page()                     __block_page_mkwrite()
  clear_page_dirty_for_io()
  unlock_page()
                                    lock_page()
                                    xfs_vm_set_page_dirty()
                                      set_buffer_dirty()
                                      TestSetPageDirty()
  clear_buffer_dirty()
  end_page_writeback()

We end up with a page whose dirty flag is set while all of its buffers are clean. The same scenario cannot go wrong on ext4: thanks to the page lock, the page and its buffers end up either all dirty or all clean.
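The interleaving can be replayed deterministically in user space. The sketch below is a single-threaded model, not a real concurrency test: the `fault_in_window` and `fault_after` flags stand for "a page fault runs at this point", and all names (`struct model_page`, `model_writeback_unlocked_clean()`, `model_writeback_locked_clean()`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct model_page { bool locked; bool dirty; bool buffers_dirty; };

static void model_lock(struct model_page *pg)
{
    assert(!pg->locked);     /* the model never contends on the lock */
    pg->locked = true;
}

static void model_unlock(struct model_page *pg)
{
    assert(pg->locked);
    pg->locked = false;
}

/* page_mkwrite side: dirties buffers and page under the page lock. */
static void model_page_mkwrite(struct model_page *pg)
{
    model_lock(pg);
    pg->buffers_dirty = true;    /* set_buffer_dirty() */
    pg->dirty = true;            /* TestSetPageDirty() */
    model_unlock(pg);
}

/* Ordering described for the affected xfs code: the page is unlocked
 * before the buffer dirty bits are cleared at ioend submission. */
static void model_writeback_unlocked_clean(struct model_page *pg,
                                           bool fault_in_window)
{
    model_lock(pg);
    pg->dirty = false;           /* clear_page_dirty_for_io() */
    model_unlock(pg);
    if (fault_in_window)
        model_page_mkwrite(pg);  /* the racing page fault */
    pg->buffers_dirty = false;   /* clear_buffer_dirty() */
}

/* ext4-like ordering: both flags are cleared while the page is locked,
 * so a fault can only run before or after the whole sequence. */
static void model_writeback_locked_clean(struct model_page *pg,
                                         bool fault_after)
{
    model_lock(pg);
    pg->dirty = false;
    pg->buffers_dirty = false;
    model_unlock(pg);
    if (fault_after)
        model_page_mkwrite(pg);
}
```

With the unlocked ordering and a fault in the window, the model ends in the "page dirty, buffers clean" state; with the locked ordering the two flags always agree.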

At this point two key facts hold:

  • page dirty + buffer clean
  • the PTE dirty bit is set: the write-protect page fault has already been taken, so the write itself has completed

Of these, page dirty + buffer clean drives the problem forward.

Back to shrink_active_list(), which may call try_to_release_page():

try_to_free_buffers()
  -> drop_buffers()
     -> buffer_busy() // dirty or locked
  -> cancel_dirty_page()

With page dirty + buffer clean, the buffers are clean and can therefore be freed, after which the page's dirty flag is cleared as well.

However, the dirty bit in the PTE is still set, so in a later shrink_page_list():

shrink_page_list()
  -> try_to_unmap()
     -> try_to_unmap_file()
        -> try_to_unmap_one()
           -> set_page_dirty() // if pte_dirty()

the page is marked dirty and its reclaim is aborted, leaving a page that is dirty and has no buffers.

So the crux of the problem is that, in xfs_vm_writepage(), the page dirty flag and the buffer dirty flags are not both cleared under the protection of the page lock.
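The last two steps, from "page dirty + buffer clean" to a dirty page without buffers, can also be traced with a small model. This is a sketch under the same simplifying assumptions as before (one PTE, no locking); `struct model_page` and the three functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_page {
    bool dirty;          /* page dirty flag */
    bool has_buffers;    /* models PagePrivate */
    bool buffers_dirty;  /* any buffer on the page is dirty */
    bool pte_dirty;      /* dirty bit in the mmap PTE */
};

/* shrink_active_list() with buffer_heads_over_limit: clean buffers are
 * freed and the page dirty flag is cancelled along with them. */
static bool model_release_buffers(struct model_page *pg)
{
    assert(pg != NULL);
    if (!pg->has_buffers || pg->buffers_dirty)
        return false;            /* buffer_busy(): nothing is freed */
    pg->has_buffers = false;
    pg->dirty = false;           /* cancel_dirty_page() */
    return true;
}

/* try_to_unmap_one(): the PTE dirty bit is transferred to the page. */
static void model_unmap(struct model_page *pg)
{
    assert(pg != NULL);
    if (pg->pte_dirty) {
        pg->dirty = true;        /* set_page_dirty() */
        pg->pte_dirty = false;
    }
}

/* State in which writeback would hit BUG_ON(!PagePrivate(page)). */
static bool model_writepage_would_bug(const struct model_page *pg)
{
    return pg->dirty && !pg->has_buffers;
}
```

Starting from `{dirty, has_buffers, !buffers_dirty, pte_dirty}`, releasing the buffers and then unmapping reaches the state that `page_buffers()` rejects. If the buffers had been dirty, the release step would have refused and the page would have kept its buffers.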

Fix

Searching the upstream code and commit history shows that the problem was fixed by the following commit:

commit e10de3723c53378e7cf441529f563c316fdc0dd3
Author: Dave Chinner <dchinner@redhat.com>
Date:   Mon Feb 15 17:23:12 2016 +1100

    xfs: don't chain ioends during writepage submission

@@ -565,6 +539,7 @@ xfs_add_to_ioend(
 	bh->b_private = NULL;
 	wpc->ioend->io_size += bh->b_size;
 	wpc->last_block = bh->b_blocknr;
+	xfs_start_buffer_writeback(bh);
 }

After this change, the buffer dirty flags are also cleared under the page lock.
