Preparing a Linux-Based VM for Use with Cloud Providers

Many companies today use the services of cloud providers (Amazon, Google, Azure, Rackspace, DigitalOcean and others), and in many cases people set up their servers the classic way: they pick an operating system from the provider's menu (or use an image the provider offers), log in over SSH, and from there go on to install packages, apply settings, upload scripts, add users and so on.

This is an excellent method if all you have is a single server or a fixed number of VMs. After all, many companies prefer to set up a fixed number of servers, X, and work with that: handle the load, define a flow and so on.

But when a company, whether a small startup or a large technology services company, expects millions of visits, working this way is not recommended. With that many visitors, companies use services that provide automatic Scale-Up (strictly speaking scaling out, since servers are added rather than enlarged): when a predefined criterion is met, the system launches an additional server and routes visitors to it. If needed, it keeps launching more servers as the load demands, and once the load subsides it "kills" most of them until we are back at the initial state with a small number of servers.
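To make the mechanism concrete, here is a minimal sketch of the kind of decision such a service makes on every check. The thresholds, the limits and the function name are all hypothetical, chosen only for illustration; real auto-scaling services let you configure their own criteria.

```shell
#!/bin/sh
# Hypothetical limits and thresholds, for illustration only.
MIN_SERVERS=2
MAX_SERVERS=20
SCALE_OUT_LOAD=75   # average load (%) above which a server is added
SCALE_IN_LOAD=25    # average load (%) below which a server is removed

# desired_servers CURRENT_COUNT AVG_LOAD_PERCENT
# Prints the number of servers that should be running after this check.
desired_servers() {
    count=$1
    load=$2
    if [ "$load" -gt "$SCALE_OUT_LOAD" ] && [ "$count" -lt "$MAX_SERVERS" ]; then
        count=$((count + 1))
    elif [ "$load" -lt "$SCALE_IN_LOAD" ] && [ "$count" -gt "$MIN_SERVERS" ]; then
        count=$((count - 1))
    fi
    echo "$count"
}

desired_servers 2 90   # load is high: one more server is launched
desired_servers 5 10   # load has dropped: one server is killed
desired_servers 2 10   # already at the minimum: nothing changes
```

Every step up in this loop is a new server that has to boot, which is why the boot time of the image matters so much.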

The problem with launching an additional server is the time it takes that server to come up. The Flycops company blog gives an excellent example: in their case, every new server launched as part of the Scale-Up took no less than 6 minutes before it could accept visitors. That may sound short, but those are 6 minutes in which you, as a company, lose visitors arriving from all kinds of places (Google, blogs, sites that link to you, links in emails and so on), and that is a shame.

So in Scale-Up scenarios, where the system you use launches more and more servers according to load criteria, it pays to plan ahead and build an image that boots, pulls certain settings, and is ready to accept visitors.

How is this done? It's fairly simple:

First, we'll use a virtualization system we have locally. It can be ESXi, VMware's desktop products, VirtualBox, or KVM (which is what yours truly uses).
We'll create a new guest and install it from the ISO of our preferred Linux distribution. As for disk size, there is no point in going wild (especially for those who cannot create a machine with thin provisioning): in most cases 8-10 GB is plenty. As for partitions, everyone can decide which approach to take, with or without LVM. I recommend a single (flat) partition that holds everything. Important: when selecting packages, don't install a graphical desktop; it just takes up space and resources.
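For those on KVM, creating the disk could look roughly like the sketch below. It assumes qemu-img is installed; the file name image.qcow2 and the run helper are placeholders of this example, not something the steps above prescribe. By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Dry-run helper: prints the command by default; set DRY_RUN=0 to execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

DISK=image.qcow2   # hypothetical file name
SIZE=10G           # 8-10 GB is usually enough for a server without a GUI

# A qcow2 image grows on demand, so this does not allocate 10 GB up front.
run qemu-img create -f qcow2 "$DISK" "$SIZE"

# Shows the virtual size next to the space actually used on disk.
run qemu-img info "$DISK"
```

Because qcow2 is allocated lazily, this gives the effect of thin provisioning even on a plain desktop setup.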
Once the installation is done, we'll start the virtual machine, connect to it (over SSH) and make sure it has an Internet connection.
Next we need to install the applications we want in the VM. I recommend choosing one of the following solutions:
- There is Packer (written in Go; thanks to Amos for the correction), with which you can build the entire installation you need on the VM. It is a very good fit for JSON fans.
- There is Cloud-Init, written by Canonical and happily "adopted" by Red Hat. Its advantage is that it is much friendlier to sysadmins who don't want to dig too deep into the "guts". With Cloud-init you define which users should exist and which packages are needed, and on the next reboot the system does it all by itself.
  Note: Cloud-init has to be installed inside the VM. Since it is in the EPEL repository, run yum install epel-release (if you are using CentOS you don't need the URL with the latest version, it's automatic), and then yum install cloud-init.
- Working with Puppet is possible, as long as you know how to work without a Puppet Master.
- Very important: run an update after you have installed what you wanted. The machines based on this image will serve people from the outside, and it's unpleasant to get hacked just because we forgot to update all the DEB/RPM packages.
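As an illustration of the Cloud-init option, the sketch below writes a minimal cloud-config file that creates a user, installs packages and applies updates on boot. The user name deploy, the package names, the key placeholder and the file name user-data are all assumptions made for this example. Where the file ends up depends on your setup: it can be passed as user data through the provider, or placed inside the image under /etc/cloud/cloud.cfg.d/.

```shell
#!/bin/sh
# Writes a minimal cloud-config file. Every name in it (user, packages,
# output file) is a placeholder chosen for illustration.
USER_DATA=${USER_DATA:-user-data}

cat > "$USER_DATA" <<'EOF'
#cloud-config
users:
  - name: deploy
    groups: wheel
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ssh-rsa REPLACE_WITH_YOUR_PUBLIC_KEY
packages:
  - nginx
  - git
package_update: true
package_upgrade: true
EOF

echo "wrote $USER_DATA"
```

The first line, #cloud-config, is required: it is how cloud-init recognizes the format of the file.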

After you have chosen one of the solutions above and applied everything you wanted in the VM, it is time to prepare it for work at the provider. The following instructions need to be carried out from a Linux machine:

Shut down the virtual machine and go to the directory where its image is stored.
Install the libguestfs package for the Linux distribution you are using (outside the VM). On many distributions the command-line tools ship in a package named libguestfs-tools.
Since the machine may contain things we don't need (various keys we used to copy from other places, SSL certificates, cache files and so on), we'll use virt-sysprep to clean the image. Run virt-sysprep -a image.vmdk (where image.vmdk is the file name of your VM's image). virt-sysprep cleans out everything that isn't needed and also removes the MAC addresses configured for the network cards.
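Before letting virt-sysprep modify the image, it can be useful to see what it is going to do. A cautious sequence might look like the sketch below; the run helper and the IMAGE variable are placeholders of this example, and by default the script only prints the commands (set DRY_RUN=0 to execute them).

```shell
#!/bin/sh
# Dry-run helper: prints the command by default; set DRY_RUN=0 to execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

IMAGE=image.vmdk   # the file name of your VM's image

# List the cleanup operations virt-sysprep supports and which are on by default.
run virt-sysprep --list-operations

# Read-only pass: reports what would be done without changing the image.
run virt-sysprep --dry-run -a "$IMAGE"

# The real cleanup. The VM must be shut down at this point.
run virt-sysprep -a "$IMAGE"
```

Keeping a copy of the image before the real run is a cheap safeguard, since the cleanup happens in place.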

Before uploading the image to the cloud, make sure you set up the partitions and other things (kernel modules, various settings) the way your cloud provider requires; every provider has its own quirks.

If you are using Ravello (for testing or a PoC):
We need to shrink the image (installations create temporary files that are later deleted, but the image does not shrink automatically). For that we'll use virt-sparsify (again, make sure the VM is powered off) in the following form:
virt-sparsify image.qcow2 final.qcow2
(Again, image.qcow2 is the current name of your machine's image, and final.qcow2 is the name of the image after shrinking.)
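To see how much was actually saved, the two files can be compared after the run. This is only a sketch: the run helper is a convenience of this example, and by default it prints the commands instead of running them (set DRY_RUN=0 to execute).

```shell
#!/bin/sh
# Dry-run helper: prints the command by default; set DRY_RUN=0 to execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

SRC=image.qcow2   # the image as it is now
DST=final.qcow2   # the image after shrinking

# Copies SRC to DST, leaving out the blocks the guest no longer uses.
run virt-sparsify "$SRC" "$DST"

# Compare the space each file really occupies on disk.
run du -h "$SRC" "$DST"
```

virt-sparsify needs temporary space while it works, so make sure the host has room for roughly another copy of the image.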

If you are using Google Compute Engine:
In this case it is recommended to follow Google's own instructions on how to upload the image and what it should contain.

If you are using Amazon's service:
With Amazon, unfortunately they do not accept qcow2 files at this stage, so you will need to convert your image to VMDK (the procedure is the same as in the instructions linked above, except that instead of -O qcow2 you need to write -O vmdk).
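The -O flag is the output-format option of qemu-img convert, so the conversion presumably looks something like the sketch below. That the linked instructions use qemu-img is an assumption here, as are the file names and the two helper functions. By default the script only prints the command; set DRY_RUN=0 to run it.

```shell
#!/bin/sh
# Dry-run helper: prints the command by default; set DRY_RUN=0 to execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# vmdk_name FILE: prints FILE with its .qcow2 suffix replaced by .vmdk
vmdk_name() {
    echo "${1%.qcow2}.vmdk"
}

SRC=final.qcow2            # hypothetical name of the cleaned, shrunken image
DST=$(vmdk_name "$SRC")

# -f is the input format, -O the output format.
run qemu-img convert -f qcow2 -O vmdk "$SRC" "$DST"
```

The original qcow2 file is left untouched, so it can stay as the master copy that you update and convert again later.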

We can now use the uploaded image as a template. It is recommended to keep the image somewhere, update it every so often, upload it again to the cloud (after it has been through virt-sysprep) and use the new image as the template.

5 comments on "Preparing a Linux-Based VM for Use with Cloud Providers"


Amos Shapira commented:

April 18, 2015 at 2:43

A few corrections and disagreements:

1. Packer (https://github.com/mitchellh) is implemented in Go
2. I strongly disagree with the recommendation to stay away from Puppet – Puppet can be used from local files without requiring a Puppet Master.

In fact, that's exactly where I'm taking my current workplace, the plan we are working on:
1. Using Vagrant+(VirtualBox | EC2)+Puppet+Serverspec to build and test configurations on user's desktops
2. Build AMIs using Puppet and Packer
3. Moving towards building Docker images, again using Packer

We might skip the AMI stage (2) and jump straight to Docker images.

My points are:
1. Packer is very portable across multiple platforms, including Vagrant. This allows us to reuse our Vagrant+VirtualBox/EC2+Puppet setup for development both now (where we use Puppet Master on EC2) and in the future when we move to Docker images.
2. It's very highly visible = large community and support (and better chance of future survival)
3. Your article seems to focus on initial system setup, but the point of automatic deployment tools like Puppet is that you can repeat them multiple times, which is extremely important when you want to verify that a roll-out of an update is going to work and not break something else.

1. Hetz Ben Hamo commented:
  
  April 18, 2015 at 2:56
  
  OK Amos, first of all, thanks for the corrections. I have fixed the post.
  
  As for what my article addresses: true, it deals mostly with the initial setup, but also with what comes after. Indeed, you can work without a Puppet Master, and if you have already invested in Puppet it makes sense to keep working with it rather than switch to another tool.
  
  That said, if someone is using tools that require a central server to receive update instructions, settings and so on, it may be worth looking at cloud-init.
  
  One thing I didn't understand from your comment: you can bring up a machine and choose packages directly with Packer from scratch, so why do you need Puppet (in item 2 of the middle list you wrote)?
  
  1. Amos Shapira commented:
    
    April 19, 2015 at 0:14
    
    My workplace had set up a puppetmaster before I joined and it's a headache.
    What happens is that when an EC2 instance is started by the AutoScaling Group, it's busy running puppet and other things for over 5 minutes, and even then it could fail on transient problems (unavailable external repositories, network flakiness etc).
    
    I used puppetmaster in previous, non-Cloud, setups but now I'm convinced of the superiority of immutable servers, especially on the cloud.
    In such a setup you create an image and fire it up without running a puppet agent on it (and therefore with no need for a puppet master), which gives you:
    1. No need for a puppet master: easier to maintain multi-region (we have servers in four regions: Sydney, Ireland, Oregon and Singapore).
    2. Faster boot up – obvious. A server can be up and ready in about a minute.
    3. Assurance that what the server runs is exactly what you tested. Particularly once we move to Docker images on top of EC2 (yes I know about the new Elastic Containers but not sure they have an advantage, it's a new product to learn and it's not available in all regions yet).
    4. No issues around wondering when and whether an update rolled out to all your instances and whether this is a cause for a problem you are troubleshooting.
    5. Easier update in terms of being able to run up any version of the image you have in your archive, also means easier testing before roll-out and easier separation of having different versions running in different stages (test, staging, vs. prod). With a puppetmaster it's a chore to keep the different environments separated and yet up to date.
    
Nadav Kavalerchik commented:

April 18, 2015 at 9:57

Thank you very much!

Amos Shapira commented:

April 19, 2015 at 0:29

Sorry, I didn't answer your question at the bottom: we already have a substantial (though very crappy) Puppet manifest. Packer is not meant to replace Puppet but to provide a cross-platform environment to provision images. It still needs "Provisioners" (one of them is the Puppet provisioner) to actually put things inside the images it creates.

Now if you were referring to Docker then your question could make more sense –
Perhaps Docker's configuration language (https://docs.docker.com/reference/builder/) will prove strong enough to replace Puppet but right now, before we switch completely to Docker, I want to take advantage of what we already have.

Puppet also has the advantage of being portable and recognised: using it for most of the configuration means that we can share Puppet "code" with non-Packer, non-Docker environments and across platforms (we have a small number of Windows EC2 instances).

I do not discard even the option of using Puppet to drive Docker, perhaps generating Dockerfile using Puppet templates. I still have to learn about Docker, Packer and how this will fit my purposes before I have a clear idea of how to implement this.
