Nova Source Code Analysis
2021SC@SDUSC
1. What is Nova
Nova is the OpenStack project that provides compute instances (a.k.a. virtual servers). Nova supports creating virtual machines and has limited support for system containers. It runs as a set of daemons on top of existing Linux servers to provide this service.
Nova is made up of the following components, which together deliver the basic service:
![Nova architecture](https://docs.openstack.org/nova/xena/_images/architecture.svg)
- DB: sql database for data storage.
- API: component that receives HTTP requests, converts commands and communicates with other components via the oslo.messaging queue or HTTP.
- Scheduler: decides which host gets each instance.
- Compute: manages communication with hypervisor and virtual machines.
- Conductor: handles requests that need coordination (build/resize), acts as a database proxy, or handles object conversions.
- Placement: tracks resource provider inventories and usages.
For end users, servers can be created and managed directly through the API via Horizon, the OpenStack Client, or the Nova Client.
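As an illustration, here is a minimal sketch of creating a server through the API with the openstacksdk client; the cloud name and the image, flavor, and network names are assumptions for the example:

```python
# Minimal sketch using openstacksdk (the cloud/image/flavor/network
# names are assumptions; adjust them to your deployment).
import openstack

conn = openstack.connect(cloud='devstack')

image = conn.compute.find_image('cirros')
flavor = conn.compute.find_flavor('m1.tiny')
network = conn.network.find_network('private')

# This HTTP request lands in nova-api, which validates it, checks quota,
# records the instance as 'building', and hands off to nova-conductor.
server = conn.compute.create_server(
    name='demo', image_id=image.id, flavor_id=flavor.id,
    networks=[{'uuid': network.id}])

# Block until the whole pipeline finishes and the instance is ACTIVE.
server = conn.compute.wait_for_server(server)
print(server.status)
```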
Nova can also be configured to emit notifications over RPC.
For developers, OpenStack provides a rich set of guides and references to learn from.
When a user issues a new request, it is first handled by nova-api. nova-api runs a series of checks on the request, including whether it is valid and whether the quota is sufficient; once the checks pass, nova-api assigns a unique instance ID to the request and creates a corresponding entry in the database to record the instance's state. nova-api then hands the request to nova-conductor.
nova-conductor mainly manages communication between services and coordinates task handling. After receiving the request, it builds a RequestSpec object for nova-scheduler that wraps all scheduling-related request information, then calls the select_destinations interface of the nova-scheduler service.
From the received RequestSpec object, nova-scheduler first builds a ResourceRequest object and sends it to Placement for a round of pre-filtering, then makes a scheduling decision based on the latest system state in the database and tells nova-conductor to dispatch the request to a suitable compute node.
Once the scheduling decision is known, nova-conductor sends the request to the corresponding nova-compute service.
Each nova-compute service has its own Resource Tracker that monitors the local host's resource usage. When the compute node receives the request, the Resource Tracker checks whether the host has enough resources:

- If the resources are sufficient, the requested instance is started, its state is updated in the database, and the latest host resource usage is written back to the database as well.
- If the current host cannot meet the request's resource demands, nova-compute rejects the request and sends it back to nova-conductor, and the whole scheduling process is retried (the claim-or-reschedule idea is sketched below).
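The claim-or-reschedule behavior can be pictured roughly like this (a simplified sketch, not the actual ResourceTracker code; all names are illustrative):

```python
# Simplified sketch of the claim-or-reschedule idea (illustrative only).
class HostResources:
    def __init__(self, free_ram_mb: int, free_disk_gb: int):
        self.free_ram_mb = free_ram_mb
        self.free_disk_gb = free_disk_gb


def try_claim(host: HostResources, ram_mb: int, disk_gb: int) -> bool:
    """Deduct the requested resources if the host can fit the instance."""
    if host.free_ram_mb >= ram_mb and host.free_disk_gb >= disk_gb:
        host.free_ram_mb -= ram_mb
        host.free_disk_gb -= disk_gb
        return True
    # Claim failed: the real code sends the request back to
    # nova-conductor, which retries scheduling on an alternate host.
    return False
```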
2. Components
folders:
api: receives and handles API calls
cmd: entry points for the various Nova services
compute: daemon that creates and terminates virtual machines and manages communication with the hypervisor
conf: configuration options
conductor: handles requests that need coordination; the intermediary between api, scheduler, and compute
console: console services
db: wraps the database services
hacking: coding style checks
image: wraps image operations
keymgr: key manager
locale: internationalization
network: network services
notifications: notifications
objects: wraps database operations so that the database is never manipulated directly
pci: PCI/SR-IOV support
scheduler: scheduler service
servicegroup: service group membership
storage: Ceph storage
tests: unit tests
virt: the supported hypervisor drivers
volume: wraps volume services; an abstraction over the Cinder interface
novncproxy: the noVNC proxy, which lets users access instance consoles through a browser VNC connection
files:
__init__.py
availability_zones.py # utility functions for availability zones
baserpc.py # base RPC client/server implementation
block_device.py # block device mapping
cache_utils.py # oslo_cache wrapper
config.py # command-line argument parsing
context.py # the request context carried through all of Nova
crypto.py # wrappers for standard cryptographic primitives
debugger.py # pydev debugging
exception.py # base exception classes
exception_wrapper.py # exception wrapping
filters.py # base filter classes
i18n.py # oslo_i18n integration
loadables.py # loadable classes
manager.py # base Manager class
middleware.py # updates default options of oslo_middleware
monkey_patch.py # eventlet monkey patching
policy.py # policy engine
profiler.py # OSProfiler hooks
quota.py # per-project resource quotas
rpc.py # utility functions for RPC operations
safe_utils.py # utility functions that do not cause circular imports
service.py # generic node base class for all workers running on a host
service_auth.py # authentication plugin
test.py # base classes for unit tests
utils.py # utility functions
version.py # version management
weights.py # weight plugins
wsgi.py # server class for managing WSGI applications

3. Structure of the Related Services
conductor
api.py: wraps the RPC interface
rpcapi.py: provides the RPC interface
manager.py: handles the RPC API calls
All database access from compute must go through the conductor proxy; the conductor operates on objects, and each object corresponds to one database table, as sketched below.
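The pattern looks roughly like the following sketch, built on oslo.versionedobjects, which Nova's objects layer is based on; the Demo class and its fields are invented for illustration:

```python
# Sketch of the nova.objects pattern using oslo.versionedobjects
# (the Demo object and its fields are invented for illustration).
from oslo_versionedobjects import base, fields


@base.VersionedObjectRegistry.register
class Demo(base.VersionedObject):
    # One object maps to one database table; fields mirror its columns.
    fields = {
        'id': fields.IntegerField(),
        'name': fields.StringField(),
    }

    @base.remotable
    def save(self):
        # In Nova, @remotable means: when running in nova-compute, this
        # method is forwarded over RPC to nova-conductor, which performs
        # the actual database write on the object's behalf.
        pass
```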
scheduler
filters: filter implementations that weed out hosts failing the given criteria
weights: weigher implementations used to compute weights and sort hosts (a minimal plugin sketch follows below)
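As a concrete illustration, a custom filter and weigher plug into the interfaces below (a minimal sketch; the two classes are invented examples, not Nova source):

```python
# Minimal sketch of the filter/weigher plugin interfaces
# (EnoughRamFilter and FreeRamWeigher are invented examples).
from nova.scheduler import filters, weights


class EnoughRamFilter(filters.BaseHostFilter):
    """Reject hosts that cannot satisfy the requested memory."""

    def host_passes(self, host_state, spec_obj):
        return host_state.free_ram_mb >= spec_obj.memory_mb


class FreeRamWeigher(weights.BaseHostWeigher):
    """Prefer hosts with more free memory."""

    def _weigh_object(self, host_state, weight_properties):
        return host_state.free_ram_mb
```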
4. A Walkthrough of Instance Creation in Nova
First, the create method of the compute API (nova/compute/api.py) is invoked, which in turn calls _create_instance.
```python
def create(self, context, instance_type,
           image_href, kernel_id=None, ramdisk_id=None,
           min_count=None, max_count=None,
           display_name=None, display_description=None,
           key_name=None, key_data=None, security_groups=None,
           availability_zone=None, forced_host=None, forced_node=None,
           user_data=None, metadata=None, injected_files=None,
           admin_password=None, block_device_mapping=None,
           access_ip_v4=None, access_ip_v6=None, requested_networks=None,
           config_drive=None, auto_disk_config=None, scheduler_hints=None,
           legacy_bdm=True, shutdown_terminate=False,
           check_server_group_quota=False, tags=None,
           supports_multiattach=False, trusted_certs=None,
           supports_port_resource_request=False,
           requested_host=None, requested_hypervisor_hostname=None):
    """Provision instances, sending instance information to the
    scheduler. The scheduler will determine where the instance(s)
    go and will handle creating the DB entries.

    Returns a tuple of (instances, reservation_id)
    """
    if requested_networks and max_count is not None and max_count > 1:
        self._check_multiple_instances_with_specified_ip(
            requested_networks)
        self._check_multiple_instances_with_neutron_ports(
            requested_networks)

    if availability_zone:
        available_zones = availability_zones.\
            get_availability_zones(context.elevated(), self.host_api,
                                   get_only_available=True)
        if forced_host is None and availability_zone not in \
                available_zones:
            msg = _('The requested availability zone is not available')
            raise exception.InvalidRequest(msg)

    filter_properties = scheduler_utils.build_filter_properties(
        scheduler_hints, forced_host, forced_node, instance_type)

    return self._create_instance(
        context, instance_type,
        image_href, kernel_id, ramdisk_id,
        min_count, max_count,
        display_name, display_description,
        key_name, key_data, security_groups,
        availability_zone, user_data, metadata,
        injected_files, admin_password,
        access_ip_v4, access_ip_v6,
        requested_networks, config_drive,
        block_device_mapping, auto_disk_config,
        filter_properties=filter_properties,
        legacy_bdm=legacy_bdm,
        shutdown_terminate=shutdown_terminate,
        check_server_group_quota=check_server_group_quota,
        tags=tags, supports_multiattach=supports_multiattach,
        trusted_certs=trusted_certs,
        supports_port_resource_request=supports_port_resource_request,
        requested_host=requested_host,
        requested_hypervisor_hostname=requested_hypervisor_hostname)
```

The _create_instance method calls compute_task_api's schedule_and_build_instances method, i.e. schedule_and_build_instances in the conductor's api.py, which directly calls schedule_and_build_instances in the conductor's rpcapi.py.
```python
def schedule_and_build_instances(self, context, build_requests,
                                 request_specs,
                                 image, admin_password, injected_files,
                                 requested_networks,
                                 block_device_mapping,
                                 tags=None):
    version = '1.17'
    kw = {'build_requests': build_requests,
          'request_specs': request_specs,
          'image': jsonutils.to_primitive(image),
          'admin_password': admin_password,
          'injected_files': injected_files,
          'requested_networks': requested_networks,
          'block_device_mapping': block_device_mapping,
          'tags': tags}
    if not self.client.can_send_version(version):
        version = '1.16'
        del kw['tags']
    cctxt = self.client.prepare(version=version)
    cctxt.cast(context, 'schedule_and_build_instances', **kw)
```

cast invokes the schedule_and_build_instances method over RPC; it is an asynchronous call and returns immediately.
Up to this point, although the code path has moved from api to compute to conductor, everything is still running inside the nova-api process, right up to the cast. Since cast is asynchronous, it returns immediately without waiting for an RPC reply; nova-api's job is therefore done, the user's request is answered, and the instance state is building.
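The difference between cast and call can be demonstrated with plain oslo.messaging (a standalone sketch, not Nova source; it assumes a configured transport_url and a server listening on the 'conductor' topic):

```python
# Standalone sketch of oslo.messaging cast vs. call (assumes a configured
# transport_url and a server listening on the 'conductor' topic).
from oslo_config import cfg
import oslo_messaging as messaging

transport = messaging.get_rpc_transport(cfg.CONF)
target = messaging.Target(topic='conductor')
client = messaging.RPCClient(transport, target)
ctxt = {}

# cast: fire-and-forget; returns immediately, which is exactly why
# nova-api can answer the user while the build continues in the background.
client.prepare(version='1.17').cast(ctxt, 'schedule_and_build_instances')

# call: blocks until the server method returns a result, as the conductor
# does later when it asks the scheduler to select_destinations.
# hosts = client.prepare().call(ctxt, 'select_destinations')
```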
The request is then delivered via oslo.messaging to the conductor's manager.py, where schedule_and_build_instances is invoked. It first calls _schedule_instances, which in turn calls the scheduler client's select_destinations method.
```python
def schedule_and_build_instances(self, context, build_requests,
                                 request_specs, image,
                                 admin_password, injected_files,
                                 requested_networks, block_device_mapping,
                                 tags=None):
    # Add all the UUIDs for the instances
    instance_uuids = [spec.instance_uuid for spec in request_specs]
    try:
        host_lists = self._schedule_instances(context, request_specs[0],
                                              instance_uuids,
                                              return_alternates=True)
    except Exception as exc:
        LOG.exception('Failed to schedule instances')
        self._bury_in_cell0(context, request_specs[0], exc,
                            build_requests=build_requests,
                            block_device_mapping=block_device_mapping,
                            tags=tags)
        return
```

Like compute_api and compute_task_api, scheduler_client is a client-side wrapper around a service. The scheduler has no api.py module, however; it has a separate client directory instead, implemented in the query.py module under nova/scheduler/client. Its select_destinations method simply calls the scheduler_rpcapi's select_destinations method, and we are back at an RPC call.
The RPC wrapper is implemented in the scheduler's rpcapi.py:
```python
cctxt = self.client.prepare(
    version=version, call_monitor_timeout=CONF.rpc_response_timeout,
    timeout=CONF.long_rpc_timeout)
return cctxt.call(ctxt, 'select_destinations', **msg_args)
```

call is the synchronous RPC method: the conductor blocks until the scheduler returns, and at this point the scheduler takes over the task.
The rpcapi invokes the corresponding select_destinations method in the scheduler's manager.py, which in turn calls the driver's select_destinations method. The driver here is the scheduling driver, set by the scheduler configuration group in the config file; the default is filter_scheduler, corresponding to the nova/scheduler/filter_scheduler.py module. The algorithm filters out compute nodes that fail the configured filters, computes a weight for each survivor with the weigh methods, and finally returns the highest-weighted nodes as candidates (the core idea is sketched below).
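The core idea of the filter scheduler can be condensed into a few lines (a simplified sketch of the algorithm, not the actual FilterScheduler code):

```python
# Condensed sketch of the filter-then-weigh selection (illustrative only).
def select_host(hosts, spec_obj, host_filters, weighers):
    # 1. Filtering: drop every host that fails any enabled filter.
    candidates = [h for h in hosts
                  if all(f.host_passes(h, spec_obj) for f in host_filters)]
    if not candidates:
        return None  # the real scheduler raises NoValidHost here

    # 2. Weighing: score the survivors and pick the highest-weighted host.
    def score(host):
        return sum(w._weigh_object(host, spec_obj) for w in weighers)

    return max(candidates, key=score)
```

The real FilterScheduler also normalizes weights, applies per-weigher multipliers, and keeps alternate hosts for retries; all of that is omitted here.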
Finally, nova-scheduler returns the selected set of hosts and its task ends. Since nova-conductor invoked the method synchronously, nova-scheduler hands the result back to the nova-conductor service.
Having waited for the scheduler's reply, the conductor resumes in manager.py's schedule_and_build_instances method.
It then calls compute_rpcapi's build_and_run_instance:
```python
with obj_target_cell(instance, cell) as cctxt:
    self.compute_rpcapi.build_and_run_instance(
        cctxt, instance=instance, image=image,
        request_spec=request_spec,
        filter_properties=filter_props,
        admin_password=admin_password,
        injected_files=injected_files,
        requested_networks=requested_networks,
        security_groups=legacy_secgroups,
        block_device_mapping=instance_bdms,
        host=host.service_host, node=host.nodename,
        limits=host.limits, host_list=host_list,
        accel_uuids=accel_uuids)
```

As before, the rpcapi asynchronously casts to the same-named method on the compute side, and compute takes over the task.
In compute's manager.py we find the build_and_run_instance method:
```python
def build_and_run_instance(self, context, instance, image, request_spec,
                           filter_properties, accel_uuids,
                           admin_password=None, injected_files=None,
                           requested_networks=None, security_groups=None,
                           block_device_mapping=None, node=None,
                           limits=None, host_list=None):

    @utils.synchronized(instance.uuid)
    def _locked_do_build_and_run_instance(*args, **kwargs):
        # NOTE(danms): We grab the semaphore with the instance uuid
        # locked because we could wait in line to build this instance
        # for a while and we want to make sure that nothing else tries
        # to do anything with this instance while we wait.
        with self._build_semaphore:
            try:
                result = self._do_build_and_run_instance(*args, **kwargs)
            except Exception:
                # NOTE(mriedem): This should really only happen if
                # _decode_files in _do_build_and_run_instance fails, and
                # that's before a guest is spawned so it's OK to remove
                # allocations for the instance for this node from
                # Placement below as there is no guest consuming
                # resources anyway. The _decode_files case could be
                # handled more specifically but that's left for
                # another day.
                result = build_results.FAILED
                raise
            finally:
                if result == build_results.FAILED:
                    # Remove the allocation records from Placement for
                    # the instance if the build failed. The instance.host
                    # is likely set to None in _do_build_and_run_instance
                    # which means if the user deletes the instance, it
                    # will be deleted in the API, not the compute
                    # service. Setting the instance.host to None in
                    # _do_build_and_run_instance means that the
                    # ResourceTracker will no longer consider this
                    # instance to be claiming resources against it, so
                    # we want to reflect that same thing in Placement.
                    # No need to call this for a reschedule, as the
                    # allocations will have already been removed in
                    # self._do_build_and_run_instance().
                    self.reportclient.delete_allocation_for_instance(
                        context, instance.uuid)

                if result in (build_results.FAILED,
                              build_results.RESCHEDULED):
                    self._build_failed(node)
                else:
                    self._build_succeeded(node)

    # NOTE(danms): We spawn here to return the RPC worker thread back to
    # the pool. Since what follows could take a really long time, we don't
    # want to tie up RPC workers.
    utils.spawn_n(_locked_do_build_and_run_instance,
                  context, instance, image, request_spec,
                  filter_properties, admin_password, injected_files,
                  requested_networks, security_groups,
                  block_device_mapping, node, limits, host_list,
                  accel_uuids)
```

The driver here is the compute driver, set by compute_driver in the compute configuration group; in this case it is libvirt.LibvirtDriver, whose code lives in nova/virt/libvirt/driver.py. Its spawn() method calls libvirt to create the virtual machine and waits for the instance to become Active; with that, the nova-compute service is done and the whole instance-creation flow is complete.
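Every driver under nova/virt implements the same contract; the sketch below shows the spawn() entry point with the signature used by recent Nova releases (the FakeDriver stub is illustrative, not Nova source):

```python
# Illustrative stub of the virt driver contract (not Nova source).
class FakeDriver:
    def spawn(self, context, instance, image_meta, injected_files,
              admin_password, allocations, network_info=None,
              block_device_info=None, power_on=True, accel_info=None):
        """Create a new instance on this host and boot it.

        LibvirtDriver's implementation generates the guest XML from the
        instance/flavor/image metadata, defines and launches the domain
        through libvirt, and waits for the guest to report running
        before returning, at which point Nova marks the instance ACTIVE.
        """
        raise NotImplementedError()
```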
5. Summary
There is much more to explore in Nova's architecture and mode of operation. The recurring pattern is that the layers talk to each other over RPC and invoke the implementation methods in each service's manager; the concrete strategies, however, remain to be investigated further.